Rajamani Sundar
Advisor:  Ravi  Boppana 

Abstract 


This  thesis  investigates  the  amortized  complexity  of  some  fundamental  data  structure 
problems  and  introduces  interesting  ideas  for  proving  lower  bounds  on  amortized  com- 
plexity and  for  performing  amortized  analysis.  The  problems  are  as  follows: 

•  Dictionary  Problem:  A  dictionary  is  a  dynamic  set  that  supports  searches  of  ele- 
ments and  changes  under  insertions  and  deletions  of  elements.  It  is  open  whether 
there  exists  a  dictionary  data  structure  that  takes  constant  amortized  time  per 
operation  and  uses  space  polynomial  in  the  dictionary  size.  We  prove  that  dictio- 
nary operations  require  log-logarithmic  amortized  time  under  a  multilevel  hashing 
model  that  is  based  on  Yao's  cell  probe  model. 

•  Splay  Algorithm's  Analysis:  Splay  is  a  simple,  efficient  algorithm  for  searching 
binary  search  trees,  devised  by  Sleator  and  Tarjan,  that  uses  rotations  to  reorganize 
the  tree.  Tarjan  conjectured  that  Splay  takes  linear  time  to  process  deque  operation 
sequences  on  a  binary  tree  and  proved  a  special  case  of  this  conjecture  called  the 
Scanning  Theorem.  We  prove  tight  bounds  on  the  maximum  numbers  of  various 
types  of  right  rotations  in  a  sequence  of  right  rotations  performed  on  a  binary 
tree.  One  of  the  lower  bounds  refutes  a  conjecture  of  Sleator.  We  apply  the  upper 
bounds  to  obtain  a  nearly  linear  upper  bound  for  Tarjan's  conjecture.  We  give  two 
new  proofs  of  the  Scanning  Theorem,  one  of  which  is  a  potential-based  proof  that 
solves  a  problem  of  Tarjan. 

•  Set  Equality  Problem:  The  task  of  maintaining  a  dynamic  collection  of  sets  under 
various  operations  arises  in  many  applications.  We  devise  a  fast  data  structure 
for  maintaining  sets  under  equality-tests  and  under  creations  of  new  sets  through 
insertions  and  deletions  of  elements.  Equality-tests  require  constant  time  and  set- 
creations  require  logarithmic  amortized  time.  This  improves  previous  solutions. 

Chapter 1


Introduction 


1.1      Amortized  Complexity 

In  many  applications  of  data  structures,  the  data  structure  is  embedded  within  some 
algorithm  that  performs  successive  operations  on  it.  In  these  applications,  we  are  inter- 
ested only  in  the  time  taken  by  the  data  structure  to  process  operation  sequences  as  a 
whole  and  not  in  the  time  spent  on  isolated  operations.  Amortized  data  structures  are 
data  structures  tailored  to  such  applications:  these  data  structures  may  perform  poorly 
on  a  few  individual  operations  but  perform  very  well  on  all  operation  sequences.  The 
natural  performance  measure  for  an  amortized  data  structure  is  its  amortized  complexity, 
defined  to  be  the  maximum  cost  of  operation  sequences  performed  on  the  data  structure 
as  a  function  of  the  lengths  of  the  sequences.  Amortized  data  structures  are  appealing 
because  they  dispense  with  complicated  constraints  and  associated  information  present 
in  data  structures  that  achieve  a  fast  performance  on  all  operations  and  they  use  simple 
reorganizing  heuristics,  instead,  to  achieve  a  fast  amortized  performance.  Some  exam- 
ples of  these  data  structures  are  the  compressed  tree  data  structures  for  the  Union-find 
Problem  [1,23,32],  the  Splay  Tree  [23,28,32],  and  the  Pairing  Heap  [16]. 

Amortized  data  structures  are  simple  to  describe  but  their  performance  analysis  is 
often quite involved. Since operation sequences on these data structures are mixtures
of operations of varying costs that interact very finely with one another, it is tricky to
accurately estimate their amortized complexity. Of the three amortized data structures
mentioned above, only the first one has been analyzed thoroughly; even its analysis was
accomplished only several years after the data structure was originally conceived. The
complete analysis of the other two is still open.

A  useful  framework  for  performing  amortized  analysis  involves  defining  an  appro- 
priate potential  function  for  the  data  structure  [34].  In  this  framework,  each  state  of 
the  data  structure  is  assigned  a  real  number  called  its  potential.  The  amortized  cost  of 
a  data  structure  operation  is  defined  to  be  the  actual  cost  incurred  by  that  operation 
plus  the  increase  in  potential  it  causes  through  change  of  state.  The  total  cost  of  an 
operation  sequence  performed  on  the  data  structure  equals  the  total  amortized  cost  of 
the  operations  in  the  sequence  plus  the  decrease  in  potential  from  the  initial  state  to  the 
state  at  the  end  of  the  operation  sequence.  Choosing  a  suitable  potential  function  that 
yields  a  sharp  estimate  on  the  amortized  complexity  is  a  task  that  demands  ingenuity. 
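
Stated as a formula, the accounting above reads as follows. The notation is ours rather than the source's: $c_i$ is the actual cost of the i-th operation, $a_i$ its amortized cost, $s_i$ the state after it, and $\Phi$ the potential function.

```latex
\[
  a_i = c_i + \Phi(s_i) - \Phi(s_{i-1}),
  \qquad
  \sum_{i=1}^{m} c_i \;=\; \sum_{i=1}^{m} a_i \;+\; \Phi(s_0) - \Phi(s_m).
\]
```

The second identity follows by summing the first over i, since the potential differences telescope.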

We  illustrate  the  potential  framework  through  a  simple  example.  A  priority  queue 
with  attrition  (PQA)  is  a  dynamic  set  of  real  numbers  supporting  two  operations: 
Deletemin deletes and returns the smallest element of the set; Insert(x) deletes all
elements of the set greater than x and adds x to the set. A PQA can be represented
by  a  sorted  list  of  its  elements.  In  this  representation,  Deletemin  takes  constant  time, 
and  Insert  takes  time  proportional  to  the  number  of  deleted  elements.  If  we  define  the 
potential  of  the  data  structure  equal  to  the  number  of  elements  it  contains  then  PQA 
operations  take  constant  amortized  time;  PQA  operation  sequences  take  linear  time. 
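
The PQA example can be checked mechanically. The following is a minimal sketch, assuming a unit-cost model in which adding an element, deleting an element, and a Deletemin each cost one step; the names `PQA`, `run`, and `actual_cost` are ours and purely illustrative.

```python
from collections import deque


class PQA:
    """Priority queue with attrition, kept as an increasing sorted list."""

    def __init__(self):
        self.items = deque()   # smallest element at the left end
        self.actual_cost = 0   # total number of unit steps performed so far

    def potential(self):
        # The potential from the text: the number of stored elements.
        return len(self.items)

    def insert(self, x):
        self.actual_cost += 1                  # one step for handling x itself
        # Elements greater than x sit at the right end of the sorted list.
        while self.items and self.items[-1] > x:
            self.items.pop()
            self.actual_cost += 1              # one step per deleted element
        if not self.items or self.items[-1] != x:
            self.items.append(x)               # keep the set free of duplicates

    def deletemin(self):
        self.actual_cost += 1
        return self.items.popleft() if self.items else None


def run(ops):
    """Apply ops; return the PQA and the amortized cost of each operation."""
    q, amortized = PQA(), []
    for op in ops:
        cost_before, phi_before = q.actual_cost, q.potential()
        if op[0] == "insert":
            q.insert(op[1])
        else:
            q.deletemin()
        actual = q.actual_cost - cost_before
        amortized.append(actual + q.potential() - phi_before)
    return q, amortized
```

In this cost model every Insert has amortized cost at most 2 and every Deletemin at most 1, so a sequence of m operations starting from the empty set performs at most 2m steps, even though a single Insert may delete many elements.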

The  notions  of  amortized  complexity  and  amortized  cost  are  usually  used  in  a  much 
wider  sense  than  defined  above.  Amortized  complexity  is  usually  a  maximizing  function 
of  several  parameters  of  the  operation  sequences,  instead  of  their  length  alone.  The 
amortized  costs  of  operations  are  usually  a  set  of  functions  of  the  operation  sequence 
parameters  that,  when  added  together,  yield  a  good  estimate  of  the  amortized  complex- 
ity. For  example,  the  compressed  tree  data  structures  for  the  Union-find  Problem  have 
amortized complexity equal to $O(n + m\,\alpha(m + n, n))$, and take constant time on Union
operations and $O(\alpha(m + n, n))$ amortized time on Find operations, where $\alpha(m, n)$ is an
inverse function of the Ackermann function, and m and n respectively denote the total
number  of  Finds  and  the  total  number  of  Unions  in  the  operation  sequence  [23,32]. 
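
For concreteness, here is a minimal sketch of one common compressed tree structure, using path compression on Find and linking by rank on Union. It is an illustration only: the class and method names are ours, and the source does not say which linking rule the cited bound assumes.

```python
class UnionFind:
    """Disjoint sets over 0..n-1 stored as compressed trees."""

    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.rank = [0] * n            # upper bound on the height of each root

    def find(self, x):
        # First pass: walk up to the root of the tree containing x.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def link(self, r1, r2):
        """Union of two sets named by their roots; takes constant time."""
        if r1 == r2:
            return r1
        if self.rank[r1] < self.rank[r2]:
            r1, r2 = r2, r1            # make r1 the root of larger rank
        self.parent[r2] = r1
        if self.rank[r1] == self.rank[r2]:
            self.rank[r1] += 1
        return r1
```

Here `link` plays the role of the constant-time Union, which takes two roots; all the nonconstant work is in `find`, which shortens the paths it traverses so that later Finds are cheaper.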

Let  us  relate  amortized  complexity  to  other  measures  of  data  structure  performance. 
There  exist  data  structure  problems  whose  amortized  complexities  are  lower  than  their 
worst-case  complexities;  for  instance,  the  Union-find  Problem  is  solvable  in  nearly  con- 
stant amortized  time  per  operation  but  requires  nearly  logarithmic  worst-case  time  per 
operation [7,15,32]. There probably exist problems whose randomized (or average-case)
complexities  are  lower  than  their  amortized  complexities,  but  this  is  yet  to  be  proven;  for 
instance,  the  Dictionary  Problem  and  certain  other  problems  that  involve  testing  equal- 
ity of  objects  appear  to  be  good  candidates  for  this  class.  The  amortized  complexity  of 
a  dynamic  data  structure  problem  is  often  intimately  related  to  the  complexity  of  its 
static  version  via  transformations  of  solutions/adversaries  of  one  problem  into  another 
[6,39]. 

An  appropriate  model  for  proving  lower  bounds  on  the  amortized  complexity  of  a 
data  structure  problem  is  Yao's  cell  probe  model  [38].  This  model  abstracts  out  the 
arithmetic  and  indexing  capabilities  of  random  access  machines  without  ignoring  their 
word-size limitations. The model has a memory consisting of an array of b-bit locations.
Data  structure  operations  are  performed  in  a  series  of  memory  probes  in  which  the  next 
probe  location  is  always  computed  as  a  general  function  of  the  values  so  far  read.  In  this 
model,  Fredman  and  Saks  [15]  proved  tight  lower  bounds  on  the  amortized  complexity 
of  many  problems,  including  the  Union-find  Problem.  The  only  other  interesting  lower 
bound  known  in  this  model  is  for  a  static  data  structure  problem,  due  to  Ajtai  [3].  The 
complexity  of  many  other  problems,  notably,  the  Dictionary  Problem  and  the  Priority 
Queue  Problem,  remains  unexplored. 

This  completes  our  introduction  to  amortized  complexity.  Further  information  on 
this  subject  can  be  found  in  [23,32,34]. 

1.2      Our  Work 

This  thesis  investigates  the  amortized  complexity  of  some  fundamental  data  structure 
problems.  We  introduce  interesting  ideas  for  proving  lower  bounds  on  amortized  com- 
plexity and  for  performing  amortized  analysis  that  enable  progress  towards  resolving 
some  open  questions.  The  problems  studied  are  as  follows. 

In  Chapter  2,  we  study  the  amortized  complexity  of  the  Dictionary  Problem.  A  dic- 
tionary is  a  dynamic  set  that  supports  searches,  insertions,  and  deletions  of  elements.  It 
is  an  open  problem  whether  a  dictionary  can  be  maintained  in  constant  amortized  time 
per  operation  using  space  polynomial  in  the  dictionary  size;  we  denote  the  dictionary 
size by n. While hashing schemes solve the problem in constant amortized time per
operation on the average or using randomness, the best deterministic solution (uses hashing
and) takes $O(\log n/\log\log n)$ amortized time per operation. We study the deterministic
complexity of the dictionary problem under a multilevel hashing model that is based on
Yao's cell probe model, and prove that dictionary operations require $\Omega(\log\log n)$
amortized time in this model. Our model encompasses many known solutions to the dictionary
problem,  and  our  result  is  the  first  nontrivial  lower  bound  for  the  problem  in  a  reason- 
ably general  model  that  takes  into  account  the  limited  wordsize  of  memory  locations 
and  realistically  measures  the  cost  of  update  operations.  This  lower  bound  separates  the 
deterministic  and  randomized  complexities  of  the  problem  under  this  model. 

In  Chapter  3,  we  study  a  problem  arising  in  the  analysis  of  Splay,  a  rotation-based  al- 
gorithm for  searching  binary  search  trees  that  was  devised  by  Sleator  and  Tarjan.  Tarjan 
proved  that  Splay  takes  linear  time  to  scan  the  nodes  of  a  binary  tree  in  symmetric  order; 
this  result  is  called  the  Scanning  Theorem.  More  generally,  he  conjectured  that  Splay 
takes  linear  time  to  process  deque  operation  sequences  on  a  binary  tree;  this  conjecture  is 
called the Deque Conjecture. We prove that Splay takes linear-times-inverse-Ackermann
time  to  process  deque  operation  sequences.  In  the  process,  we  obtain  tight  bounds  on 
some  interesting  combinatorial  problems  involving  rotation  sequences  on  binary  trees. 
These  problems  arose  from  studying  a  conjecture  of  Sleator  that  we  refute.  We  give  two 
new  proofs  of  the  Scanning  Theorem.  One  proof  is  a  potential-based  proof;  Tarjan  had 
originally  posed  the  problem  of  finding  such  a  proof.  The  other  proof  uses  induction. 

In  Chapter  4,  we  study  the  problem  of  maintaining  a  dynamic  collection  of  sets  under 
equality-tests  of  two  sets  and  under  creations  of  new  sets  through  insertions  and  deletions 
of  elements.  We  devise  a  data  structure  that  takes  constant  time  on  equality-tests  and 
logarithmic  amortized  time  on  set-creations.  The  data  structure  derandomizes  a  previous 
randomized  data  structure,  due  to  Pugh  and  Teitelbaum,  that  took  logarithmic  expected 
amortized  time  on  set-creations. 

Some  of  our  work  has  been  published  before.  The  work  on  the  Deque  Conjecture  was 
reported in [30]. The work on the Set Equality Problem was reported in a joint paper
with  Robert  E.  Tarjan  [31]  that  also  dealt  with  other  related  problems. 


Chapter  2 

The  Dictionary  Problem 


The  Dictionary  Problem  is  a  classical  data  structure  problem  arising  in  many  applications 
such  as  symbol  table  management  in  compilers  and  language  processors.  The  problem 
is  to  maintain  a  dynamic  set  under  the  operations  of  search,  insertion,  and  deletion  of 
elements.  The  problem  can  be  solved  in  constant  time  per  operation  using  a  bit  vector 
that has a separate bit for each element of the universe indicating its presence in the
set.  This  simple  solution  is  unsatisfactory  since  it  uses  space  proportional  to  the  universe 
size  and  the  dictionary  is  usually  very  small  compared  to  the  universe.  Does  there  exist  a 
constant-amortized-time  solution  that  uses  only  space  polynomial  in  the  dictionary  size? 
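
The bit vector solution is easy to sketch. The version below assumes the universe is the set of integers 0..|U|-1; the class name `BitVectorDictionary` is ours. Every operation touches a single byte, but the storage allocated depends only on the universe size and not on the number of elements present.

```python
class BitVectorDictionary:
    """Dictionary over the universe {0, ..., universe_size - 1}, one bit per element."""

    def __init__(self, universe_size):
        # Space is proportional to the universe, however small the dictionary is.
        self.bits = bytearray((universe_size + 7) // 8)

    def insert(self, x):
        self.bits[x >> 3] |= 1 << (x & 7)

    def delete(self, x):
        self.bits[x >> 3] &= ~(1 << (x & 7)) & 0xFF

    def search(self, x):
        return bool((self.bits[x >> 3] >> (x & 7)) & 1)
```

A dictionary holding two elements of a universe of size 1000 still occupies 125 bytes here, which is the drawback the text points out.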

In  this  chapter,  we  study  the  amortized  complexity  of  the  dictionary  problem  under 
a  multilevel  hashing  model  that  is  based  on  Yao's  cell  probe  model,  and  prove  that 
dictionary  operations  require  log-logarithmic  amortized  time  in  this  model. 

2.1      Introduction 

The  Dictionary  Problem  is  to  maintain  a  dynamic  set,  called  a  dictionary,  over  a  finite, 
ordered  universe  U  under  the  following  operations: 

•  Search(x):  Determine whether element x is in the dictionary; if so, find a memory
location storing x.

•  Insert(x):  Add element x to the dictionary.

•  Delete(x):  Remove element x from the dictionary.

The dictionary is initially the empty set. We would like to know how fast the problem
can be solved using only space polynomial in the dictionary size (denoted by n) on a
RAM with $(\log|U|)$-bit memory words. Here we are restricting the space used in terms
of  the  dictionary  size  in  order  to  exclude  the  trivial  bit  vector  solution  and  a  family  of 
its  generalizations,  due  to  Willard  [37],  that  process  dictionary  operations  in  constant 
time  and  use  space  depending  on  the  size  of  the  universe. 

The  Dictionary  Problem  has  been  extensively  studied  and  has  a  rich  variety  of  solu- 
tions. 

Balanced search trees solve the problem in $O(\log n)$ time per operation and use $O(n)$
space  [1,21,23,32].  Self-adjusting  search  trees  such  as  Sleator  and  Tarjan's  Splay  Tree 
[28]  and  Galperin  and  Rivest's  Scapegoat  Tree  []x]  take  O(logn)  amortized  time  per 
operation.  There  is  no  hope  of  improving  the  log  n  behaviour  of  search  trees  since  they 
use  only  comparisons  to  perform  searches. 

Hashing schemes solve the problem in constant expected time per operation and
use $O(n)$ space. Uniform hashing performs dictionary operations in constant expected
time when the operations are randomly chosen from a uniform distribution [21,1]; the
space used is $O(n)$. Universal hashing is an improved randomized hashing scheme that
performs  searches  in  constant  expected  time  and  performs  updates  in  constant  expected 
amortized  time  [8];  the  expectation,  here,  is  over  the  random  hash  functions  chosen 
by  the  algorithm  and  the  bounds  apply  to  all  possible  operation  sequences.  Dynamic 
perfect  hashing  [13]  (see  also  [2,14])  is  a  further  improvement  that  performs  searches  in 
constant  worst-case  time  and  performs  updates  in  constant  expected  amortized  time. 
All of the above hashing schemes fall under the general category of multilevel hashing
schemes. Roughly speaking, a multilevel hashing scheme consists of a hierarchically
organized system of hash functions that successively partition the dictionary into finer
pieces until all elements in the dictionary have been separated, plus an algorithm to
update the configuration after each dictionary operation.

The  fastest  deterministic  solution  to  the  problem,  at  present,  is  a  multilevel  hashing 
scheme, due to Fredman and Willard [17], that takes $O(\log n/\log\log n)$ amortized time
per operation and uses $O(n)$ space.

It has been possible to prove nonconstant lower bounds for the problem on certain
deterministic hashing models. Dietzfelbinger et al. [13] showed a tight bound of $\Theta(\log n)$ on
the amortized complexity of dictionary operations under a wordsize-independent
multilevel hashing model that does not include search trees. Mehlhorn et al. [24] strengthened
this model so that it includes search trees and showed that dictionary operations have
$\Theta(\log\log n)$ amortized complexity under their model. These models assume that
memory locations have an unbounded wordsize and overestimate the costs of operations; this
simplifies the task of proving lower bounds but the models are not realistic. If memory
locations have a sufficiently large wordsize, the problem is solvable in constant time per
operation (Paul and Simon [25]; Ajtai et al. [4]). When memory locations can only
represent a single element of the universe, however, the best available solution is Fredman
and Willard's $O(\log n/\log\log n)$-time solution [17].

We  prove  a  nonconstant  lower  bound  for  the  Dictionary  Problem  under  a  multilevel 
hashing  model,  based  on  Yao's  cell  probe  model,  that  takes  into  account  the  limited 
wordsize  of  memory  locations  and  realistically  measures  the  costs  of  operations. 

We  define  a  generic  multilevel  hashing  model  for  solving  the  Dictionary  Problem 
from  which  various  lower  bound  models  for  the  problem  can  be  derived  by  specifying 
suitable  cost  measures  for  the  operations.  The  model  has  a  vertically  organized  memory 
that  consists  of  a  root  location  at  level  1  and  an  array  of  at  most  m  locations  at  level  i, 
for each $i \ge 2$. Memory locations have b bits each, for some $b \ge \log|U|$. Each memory
location stores some c different elements of the universe plus a $(b - c\log|U|)$-bit value
that guides search operations; here the number of elements stored at a location varies
over time, and is not constant. A search operation locates an element in memory by
performing a sequence of memory probes: the first probe of the sequence always reads
the root location and the i-th probe, for $i \ge 2$, reads a location at level i that is
computed as a general function of the sequence of $i - 1$ values so far read and the element
being  searched.  The  operation  either  locates  the  element  in  some  location  after  a  series 
of  probes  or  concludes  that  the  element  is  not  in  the  dictionary  after  a  series  of  unsuc- 
cessful probes.  A  search  operation  finally  replaces  the  current  memory  configuration  by 
a  new  configuration,  representing  the  same  dictionary,  by  computing  a  general  function 
of  the  current  configuration.  An  update  operation  simply  replaces  the  current  memory 
configuration  by  a  new  configuration  that  represents  the  new  dictionary  by  computing  a 
general  function  of  the  current  configuration. 

We define a lower bound model for the problem by imposing a measure of operation
costs  on  the  generic  model.  The  Hamming  cost  of  an  operation  is  defined  as  the  maximum 
number  of  locations  at  any  level  whose  contents  change  due  to  the  operation.  The  cost 
of  a  search  operation  is  defined  as  the  number  of  read  probes  performed  by  the  operation 
(called  the  search  cost  of  the  operation)  plus  the  Hamming  cost  of  the  operation.  The 
cost  of  an  update  operation  is  defined  as  its  Hamming  cost.  When  this  cost  measure 
is  imposed  on  the  generic  model  we  get  a  lower  bound  model  called  the  Hamming  cost 
model. 

We  prove  our  lower  bound  in  the  following  model  that  has  a  different  cost  measure 
from the Hamming cost model. The search path of an element x of U in a memory
configuration C is defined as the sequence of locations probed by a search operation on
x performed in configuration C. We say that an operation refreshes a memory location
$\ell$ if there is some element x in the final dictionary such that $\ell$ lies on the search path
of x after the operation but $\ell$ did not lie on the search path of x before the operation.
The  refresh  cost  of  an  operation  is  defined  as  the  maximum  number  of  locations  at  any 
level  that  get  refreshed  by  the  operation.  Define  the  cost  of  an  operation  as  the  sum  of 
the  search  cost  and  the  refresh  cost  of  the  operation.  The  lower  bound  model  with  this 
cost  measure  is  called  the  refresh  cost  model.  The  difference  between  this  model  and 
the  Hamming  cost  model  is  that  the  refresh  cost  measure  is  sensitive  to  locations  that 
participate in rerouting the search paths of dictionary elements during an operation,
whereas the Hamming cost measure is sensitive to locations whose contents
change.

A  nonconstant  lower  bound  for  the  Dictionary  Problem  in  the  Hamming  cost  model 
would  translate  into  a  similar  lower  bound  for  the  problem  in  the  cell  probe  model.  We 
believe  that  such  a  lower  bound  exists,  but  we  can  only  prove  a  lower  bound  under 
the  refresh  cost  model.  The  refresh  cost  model  seems  to  be  incomparable  in  power  to 
the  cell  probe  model  and  to  the  Hamming  cost  model.  The  justification  for  the  refresh 
cost  model  is  that,  in  realistic  hashing  schemes,  an  operation  might  have  to  read,  and, 
if  necessary,  modify,  the  locations  it  refreshes  in  order  to  correctly  reroute  the  search 
paths  of  dictionary  elements.  The  refresh  cost  model  encompasses  many  of  the  known 
solutions  to  the  dictionary  problem,  but  not  all  of  them;  for  instance,  the  model  includes 
B-trees, red-black trees, and various hashing schemes, but the class of rotation-based
search trees is not included.

Our  lower  bound  is  given  by: 

Theorem 2.1 Consider the refresh cost multilevel hashing model with at most m memory
locations per level, a universe U, and a wordsize b. Let n be a positive integer
satisfying $|U| \ge m^{\log\log n}$. Consider dictionary operation sequences during which the
maximum dictionary size is n. Under this model, the amortized complexity of dictionary
operations equals $\Omega(\log(\log n/\log b))$.

In a typical situation where $|U| = m^{\log\log n}$, $m = \mathrm{poly}(n)$, and $b = \log|U|$, the theorem
yields a lower bound of $\Omega(\log\log n)$ on the amortized complexity of dictionary
operations. This lower bound separates the deterministic and randomized complexities of
hashing  schemes  in  the  refresh  cost  model,  since  this  model  encompasses  randomized 
hashing  schemes  like  universal  hashing  [8]  and  dynamic  perfect  hashing  [13]  that  process 
dictionary  operations  in  constant  randomized  amortized  time. 
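
The substitution behind the typical-case bound can be spelled out as follows. This is a sketch under two assumptions of ours: that the universe has size $m^{\log\log n}$ with $m = n^k$ for a constant $k \ge 1$, and that logarithms are taken to base 2.

```latex
\begin{align*}
  b &= \log|U| = \log\bigl(m^{\log\log n}\bigr) = k \log n \,\log\log n, \\
  \log b &= \log k + \log\log n + \log\log\log n = \Theta(\log\log n), \\
  \log\frac{\log n}{\log b} &= \log\log n - \log\log b
      = \log\log n - O(\log\log\log n) = \Omega(\log\log n).
\end{align*}
```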

The  proof  technique  of  the  theorem  can  be  used  to  show  that  single-level  hashing 
schemes in the Hamming cost model require $\Omega(n^{\alpha})$ amortized time to process dictionary
operations, for some constant $\alpha$. We hope that the proof technique will be helpful in
showing  a  general  nonconstant  lower  bound  for  the  dictionary  problem  in  the  Hamming 
cost  model. 

The  chapter  is  organized  as  follows.  In  Section  2.2,  we  introduce  the  basic  ideas 
needed  to  prove  Theorem  2.1  by  proving  a  nonconstant  lower  bound  under  the  simpler 
model  of  single-level  hashing.  In  Section  2.3,  we  prove  the  theorem. 

2.2      Single-level  Hashing  Model 

In  this  section,  we  prove  a  nearly  linear  lower  bound  for  the  Dictionary  Problem  under 
the  refresh  cost  single-level  hashing  model. 

We  define  the  lower  bound  model.  The  model  consists  of  an  array  of  m  locations  each 
capable of storing an element of U, a family of at most $2^b$ hash functions from U to the
array, and a b-bit root location storing a hash function from the family. The root hash
function  is  always  chosen  so  that  it  sends  the  elements  of  the  dictionary  into  different 
locations, and collisions of elements are forbidden; in general, when a hash function sends
the elements of a set into distinct locations it is called a perfect hash function [14] for the
set  and  it  is  said  to  shatter  the  set.  Search  operations  are  performed  in  two  probes:  the 
first  probe  reads  the  hash  function  from  the  root  location;  the  second  probe  checks  if  the 
element  is  present  in  the  array  location  where  it  is  sent  by  the  hash  function.  An  update 
operation  can  change  the  root  hash  function  and  can  write  into  some  array  locations  in 
order  to  appropriately  relocate  the  dictionary  elements.  The  cost  of  a  search  operation 
is  2  units.  The  cost  of  an  update  operation  is  its  refresh  cost,  which  equals  the  number 
of  dictionary  elements  that  are  relocated  during  the  operation. 
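
The single-level model can be simulated directly. The sketch below is illustrative only: the hash family `x -> ((a*x + c) mod p) mod m` built by `make_family`, the greedy rule that picks the perfect hash function relocating the fewest elements, and all names are our assumptions, not part of the model. The refresh cost is computed from the definition as the number of elements of the final dictionary whose array location differs between the old and the new root hash function.

```python
from itertools import product


def make_family(m, p, multipliers, offsets):
    """Hypothetical family of hash functions x -> ((a*x + c) mod p) mod m."""
    return [lambda x, a=a, c=c: ((a * x + c) % p) % m
            for a, c in product(multipliers, offsets)]


def shatters(h, elements):
    """True if h sends the elements to pairwise distinct locations."""
    images = [h(x) for x in elements]
    return len(set(images)) == len(images)


class SingleLevelTable:
    def __init__(self, m, family):
        self.m = m
        self.family = family
        self.root = 0               # index of the current root hash function
        self.array = [None] * m     # one element of the universe per location
        self.elements = set()

    def search(self, x):
        h = self.family[self.root]      # probe 1: read the root location
        return self.array[h(x)] == x    # probe 2: read one array location

    def _rebuild(self, new_elements):
        """Install a perfect hash function for new_elements; return the refresh cost."""
        old = self.family[self.root]
        best = None
        for index, h in enumerate(self.family):
            if not shatters(h, new_elements):
                continue
            # Elements whose search path changes, i.e. which get relocated.
            moved = sum(1 for x in new_elements if h(x) != old(x))
            if best is None or moved < best[0]:
                best = (moved, index)
        if best is None:
            raise RuntimeError("no hash function in the family shatters the set")
        moved, self.root = best
        h = self.family[self.root]
        self.array = [None] * self.m
        for x in new_elements:
            self.array[h(x)] = x
        self.elements = set(new_elements)
        return moved

    def insert(self, x):
        return self._rebuild(self.elements | {x})

    def delete(self, x):
        return self._rebuild(self.elements - {x})
```

With a small family, an insertion that collides under the current function forces a switch to another function, and the cost of the switch is the number of elements that move; the lower bound argument exploits exactly this tension.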

The  lower  bound  for  single-level  hashing  is  given  by: 

Theorem 2.2 Consider the refresh cost single-level hashing model with an array of m
memory locations, a universe U, and a b-bit root location. Assume that $|U| > 2m$.
Consider dictionary operation sequences during which the maximum dictionary size is
at most n, where $n < m$. Under this model, the amortized complexity of dictionary
operations equals $\Omega(n/b)$.

A typical situation to apply this theorem is when $b = \log|U|$ and m and |U| are
polynomial in n. Under these assumptions, the theorem yields a lower bound of $\Omega(n/\log n)$ on
the  amortized  complexity  of  dictionary  operations  under  the  single-level  hashing  model. 

The  main  idea  behind  the  proof  of  the  theorem  is  an  adversary  who  keeps  creating 
random collisions in U. Any hashing scheme with a small family of hash functions cannot
succeed against this adversary. The proof uses the following sampling lemma, due to
Hoeffding [20], which says that a random sample closely resembles the whole population.

Lemma 2.1 (Binary Sampling Lemma) Let $k > 1$ and let $0 < \beta < \alpha < 1$. Consider
a population of at least k elements of which some $\alpha$-fraction of the elements are colored
red. A random k-subset of the population has more than $\beta k$ red elements with probability
at least $(1 - e^{-c_{\alpha,\beta} k})$, where

$c_{\alpha,\beta} = (\alpha - \beta)^2 / (2\alpha(1 - \alpha))$ if $\alpha - \beta \le 1 - \alpha$,


1    o/^^      py  otherwise. 

Let  us  prove  the  theorem.  Denote  the  amortized  cost  per  update  operation  incurred 
by a hashing scheme by w. We prove that $e^{\Omega(n/w)}$ hash functions are needed by the
hashing scheme in order to successfully process all sequences of $O(n)$ insertions. This
would  imply  the  theorem.  The  proof  of  this  result  is  organized  in  a  series  of  steps. 
A  hash  function  is  called  uniform  if  it  sends  the  same  number  of  elements  into  each 
location.  In  Section  2.2.1,  we  prove  the  result  for  the  case  when  the  hash  functions  used 
by  the  hashing  scheme  are  all  uniform  and  the  update  cost  is  bounded  in  the  worst- 
case.  In  Section  2.2.2,  we  handle  nonuniform  hash  functions.  In  Section  2.2.3,  we  handle 
amortized  update  costs. 

2.2.1      Uniform  Hash  Functions  and  Worst-case  Complexity 

In  this  section,  we  prove  that  any  single-level  hashing  scheme  in  the  refresh  cost  model 
that  uses  only  uniform  hash  functions  and  has  worst-case  update  cost  w  needs  at  least 
$e^{\Omega(n/w)}$ hash functions in order to handle all sequences of $O(n)$ insertions.

The  adversary  performs  two  batches  of  insertions:  a  large  batch  followed  by  a  small 
batch. The large batch is a random n-subset of U. Let h denote the hash function used
after the large batch; h is a random variable. The small batch is constructed as follows:
Randomly  select  k  =  n/cw  locations,  where  c  is  a  constant.  For  each  selected  location 
pick  a  random  pair  of  elements  of  U  that  h  sends  into  that  location.  The  small  batch 
is  the  union  of  these  pairs.  Since  we  assumed  that  \U\  >  2m  and  that  h  is  uniform,  h 
sends  at  least  two  elements  of  U  into  each  location. 
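
The adversary's two batches can be sketched as follows. This is an illustration under stated assumptions: the universe is given as a finite range of integers, the hash function h in use after the large batch is passed in by the caller, and the function names are ours.

```python
import random


def preimages(h, universe, m):
    """Group the universe by the array location to which h sends each element."""
    buckets = [[] for _ in range(m)]
    for x in universe:
        buckets[h(x)].append(x)
    return buckets


def large_batch(universe, n, rng):
    """A random n-subset of the universe."""
    return set(rng.sample(list(universe), n))


def small_batch(h, universe, m, k, rng):
    """Select k random locations and, for each, a random pair that h sends there."""
    buckets = preimages(h, universe, m)
    batch = set()
    for location in rng.sample(range(m), k):
        # Needs at least two elements per location; this holds when h is
        # uniform and the universe has at least 2m elements.
        batch.update(rng.sample(buckets[location], 2))
    return batch
```

Every pair in the small batch collides under h, so a scheme that keeps using h after the small batch fails, while a scheme that switches to a very different function must relocate many elements of the large batch.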

The  following  lemma  gives  the  lower  bound  on  the  size  of  the  hash  functions  family. 

Lemma 2.2 Any hashing scheme requires $e^{\Omega(k)}$ hash functions in order to succeed against
the  adversary. 

The idea behind the proof of this lemma is that any fixed pair of hash functions (h, g)
that are respectively used after the two batches has a low probability of success against
the adversary. For if h and g are sufficiently similar then g cannot shatter the small
batch. On the other hand, if h and g are sufficiently dissimilar then too many elements
in the large batch will change locations during the small batch.

Proof of Lemma 2.2. Consider a pair of hash functions (h, g) that are respectively
used after the two batches. We compute the probability of success of the hashing scheme
against the adversary using this pair of hash functions. Define S(h, g) = the number of
elements of U in which h and g differ. Two cases have to be considered:

Case 1. $S(h,g) > |U|/8$: By the Binary Sampling Lemma, the large batch has more
than n/16 elements that differ in h and g with probability at least $(1 - e^{-c_{1/8,1/16}\, n})$.
Thus the update cost incurred by (h, g) during the small batch exceeds n/16 with this
probability. Setting $k = n/32w$, the maximum update cost allowed during the small
batch equals n/16. Hence $\Pr[(h,g) \text{ succeeds}] \le e^{-c_{1/8,1/16}\, n}$.

Case 2. S(h,g) ≤ |U|/8: A location is called similar if g sends into that location
greater than (1/2)-fraction of the elements that h sends into it. At least (3/4)-fraction
of the locations are similar locations, since h and g differ in at most (1/8)-fraction of U.
By the Binary Sampling Lemma, the random set of k locations selected during the small
batch contains more than k/2 similar locations with probability at least $(1 - e^{-c_{3/4,1/2}\,k})$.

Consider a set of k locations, comprising at least k/2 similar locations, that are used
to construct the small batch. g shatters a random pair of elements sent by h into a
similar location with probability at most 3/4. Hence g shatters a random small batch
constructed using the above locations with probability at most $(3/4)^{k/2}$.

Combining the above calculations, $\Pr[(h,g) \text{ succeeds}] \le e^{-c_{3/4,1/2}\,k} + (3/4)^{k/2}$.

We  are  ready  to  compute  the  constants. 

$c_{1/8,1/16} = 1/56$;    $c_{3/4,1/2} = 1/6$;    $(1/2)\ln(4/3) > 1/7$.

For sufficiently large k, the success probability of (h,g) is at most $e^{-k/8}$. It follows that at
least $e^{k/8}$ hash functions are needed by the hashing scheme in order to succeed against
the adversary.    □
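The phrase "for sufficiently large k" can be made concrete numerically. The following is a small check, assuming that the Case 2 bound reads $e^{-k/6} + (3/4)^{k/2}$ once the constant 1/6 is substituted (the function names are illustrative). Case 1 needs no check: with n = 32wk and w ≥ 1, $e^{-n/56} \le e^{-32k/56} < e^{-k/8}$.

```python
import math


def case2_bound(k):
    """Assumed Case 2 bound on the success probability: e^{-k/6} + (3/4)^{k/2}."""
    return math.exp(-k / 6) + 0.75 ** (k / 2)


def target(k):
    """The bound claimed for sufficiently large k: e^{-k/8}."""
    return math.exp(-k / 8)


def smallest_sufficient_k(limit=10_000):
    """Smallest k below limit with case2_bound(k) <= target(k), or None."""
    for k in range(1, limit):
        if case2_bound(k) <= target(k):
            return k
    return None


if __name__ == "__main__":
    print(0.5 * math.log(4 / 3) > 1 / 7)  # the third constant in the text
    print(smallest_sufficient_k())        # prints 25
```

Since the ratio of the two sides is decreasing in k, the inequality holds for every k from that threshold onward.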

2.2.2      Nonuniform  Hash  Functions 

In  this  section,  we  extend  the  lower  bound  of  the  previous  section  to  families  of  nonuni- 
form hash  functions. 

The adversary once again inserts a large batch which is a random n-subset of U and
then inserts a small batch. The small batch is constructed based on the hash function
h that is used by the hashing scheme after the large batch. A multiple location of h is a
location into which h sends two or more elements of U. Focus on the subuniverse U' of
elements of U that h sends into its multiple locations. Select a random k-subset of U',
for k = n/cw. For each element of the subset select a random element of U' that collides
with it under h. The small batch consists of the subset and the elements selected.

Let us see why Lemma 2.2 still holds for this adversary. Once again the idea is to
show that any fixed pair of hash functions (h,g) has a low probability of success. The
case when S(h,g) > |U|/8 is handled as before. Consider the case S(h,g) ≤ |U|/8. Since
|U'| ≥ |U|/2, h and g differ in at most (1/4)-fraction of U'. An element of U' that is
sent by both h and g into a similar location is called a similar element. At least
(1/4)-fraction of U' are similar elements. By the Binary Sampling Lemma, we can expect at
least k/8 elements of the k-subset from which the small batch is constructed to be similar
elements. Now if two of these similar elements collide under h then g fails to shatter the
small batch. So suppose that h and g send all the similar elements of the small batch
into different locations. In this case, as in the previous section, it is easy to see that g
fails to shatter the small batch with probability greater than $(1/2)^{k/8}$. This completes
the second case and the proof that $e^{\Omega(k)}$ hash functions are needed to succeed against
the adversary.
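For comparison with the uniform case, here is a minimal Python sketch of the small batch used against a nonuniform hash function. It is an illustration only: the function name, the callable representation of h, and the `rng` argument are assumptions, not part of the text.

```python
import random
from collections import defaultdict


def nonuniform_small_batch(universe, h, k, rng):
    """Small batch for a possibly nonuniform h (illustrative sketch).

    Picks a random k-subset of the elements that h sends into multiple
    locations, then adds a random colliding partner for each of them.
    """
    universe = list(universe)
    buckets = defaultdict(list)
    for x in universe:
        buckets[h(x)].append(x)
    # Subuniverse of elements sent into locations receiving two or more elements.
    sub = [x for x in universe if len(buckets[h(x)]) >= 2]
    chosen = rng.sample(sub, k)
    batch = set(chosen)
    for x in chosen:
        # A random element that collides with x under h.
        partners = [y for y in buckets[h(x)] if y != x]
        batch.add(rng.choice(partners))
    return batch


if __name__ == "__main__":
    h = lambda x: x // 3 if x < 18 else x  # six triple locations, then singletons
    print(sorted(nonuniform_small_batch(range(30), h, 4, random.Random(2))))
```

The batch has between k and 2k elements, because a selected partner may itself already belong to the k-subset.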

2.2.3     Amortization 

In this section, we incorporate amortization into the previous section's argument. Let
w denote the amortized cost per update operation incurred by the hashing scheme.
The adversary of the previous section is modified by performing a greedy sequence of
insertions between the two batches. Immediately after the large batch, the adversary
performs a maximal sequence of insertions σ such that σ incurs an update cost of more
than 2w|σ|. Then the small batch is performed as before based on the hash function h
used after σ. Observe that |σ| is at most n, so only O(n) insertions are performed in total.
The maximality of σ ensures that the total update cost incurred during the small batch
is at most 4wk. Hence we can apply the argument of the previous section to obtain a
lower bound on the size of the hash function family.
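Both bounds in this paragraph follow from short calculations. The sketch below writes σ for the greedy sequence and τ for the small batch, and assumes that amortized cost w means that any sequence of t updates starting from the empty dictionary costs at most wt in total, that the small batch has 2k insertions, and that cost(·) denotes total update cost (this notation is introduced here only for the derivation).

```latex
\begin{align*}
2w|\sigma| \;\le\; \mathrm{cost}(\sigma)
  &\;\le\; \mathrm{cost}(\text{large batch}) + \mathrm{cost}(\sigma)
  \;\le\; w\,(n + |\sigma|)
  &&\Longrightarrow\quad |\sigma| \le n, \\[4pt]
\mathrm{cost}(\tau)
  &\;=\; \mathrm{cost}(\sigma\tau) - \mathrm{cost}(\sigma)
  \;\le\; 2w\,(|\sigma| + 2k) - 2w|\sigma|
  \;=\; 4wk .
\end{align*}
```

The second line uses maximality: the extended sequence στ does not qualify as a greedy sequence, so its cost is at most 2w(|σ| + 2k).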

2.3      Multilevel  Hashing  Model 

In  this  section,  we  prove  a  log-logarithmic  lower  bound  for  the  Dictionary  Problem  under 
the  refresh  cost  multilevel  hashing  model. 

The main idea behind the proof is a randomized adversary who alternately performs
greedy searches of the dictionary elements and creates random collisions at the various
levels of the multilevel hashing scheme, descending the levels of the scheme. The
collisions force the scheme to either incur a large cost on searches or incur a large cost in
compressing the dictionary to few levels. The collisions are created in batches of
insertions. Each batch is constructed by selecting a set of locations from the first i levels, by
focusing on the subuniverse of elements of U whose search paths involve just these
locations during the first i probes, and by picking a random subset from the subuniverse of
appropriate size; the subset is always chosen to be several times larger than the number
of selected locations in order to create collisions.

We  sketch  the  lower  bound  proof.  Let  us  confine  ourselves  to  worst-case  complexity 
since  amortized  complexity  can  be  easily  handled,  as  in  the  single-level  hashing  lower 
bound,  by  appropriately  performing  greedy  searches  and  greedy  insertions  during  the 
adversary  sequence.  The  adversary  is  defined  recursively  on  levels,  and  the  proof  has 
an  inductive  structure  that  is  reminiscent  of  a  previous  lower  bound  for  a  static  data 
structure  problem  in  the  cell  probe  model,  due  to  Ajtai  [3].  In  order  to  carry  out  the 
induction,  we  need  to  prove  a  stronger  result  that  applies  to  a  more  general  hashing 
model,  called  the  partial  hashing  model.  In  a  partial  hashing  scheme,  each  value  of 
the  root  location  determines  a  working  subuniverse  of  U  on  which  the  scheme  supports 
dictionary operations. The lower bound applies to partial hashing schemes that work,
occasionally, on a dense subuniverse of U; we show that such schemes fail, almost surely,
against the adversary.

The  main  feature  of  the  proof  is  the  handling  of  nonuniform  root  hash  functions.  The 
value  stored  at  the  root  location  of  the  hashing  scheme  defines  a  partial  hash  function 
which  is  essentially  the  second  probe  function  used  by  the  scheme  to  perform  searches. 
We  say  that  this  hash  function  is  narrow  if  it  sends  a  constant  fraction  of  the  universe 
into  a  small  set  of  level-2  locations;  otherwise  the  hash  function  is  said  to  be  wide. 
The adversary first recursively performs a narrow phase of insertion batches in U that
force the scheme to use narrow hash functions and then recursively performs a wide
phase of much smaller insertion batches within a random subuniverse of U. The wide
phase is not always performed; it is performed only if the narrow phase gets prematurely
truncated because the hashing scheme has stopped using narrow hash functions. The
adversary consists of two different phases because the hashing scheme behaves differently
depending on whether it mostly uses narrow hash functions or it mostly uses wide hash
functions, and two phases are needed to fool all hashing schemes. We show that the
probability of success of the hashing scheme through the completion of either of the
phases is small. We analyze the narrow phase using induction; we analyze the wide
phase by showing that fixed sequences of root hash functions used during this phase
have a low success probability, using some random sampling lemmas and induction.

The  proof  of  the  lower  bound  is  organized  as  follows.  In  Section  2.3.1,  we  describe 
the  partial  hashing  model.  In  Section  2.3.2,  we  describe  the  adversary  used  for  proving  a 
lower  bound  on  worst-case  complexity.  In  Section  2.3.3,  we  state  and  prove  two  technical 
random  sampling  lemmas  needed  to  analyze  the  wide  phase.  In  Section  2.3.4,  we  prove 
the  worst-case  lower  bound.  In  Section  2.3.5,  we  handle  amortized  complexity. 

2.3.1      Partial  Hashing  Model 

In  this  section,  we  describe  the  partial  hashing  model. 

A partial hashing scheme is a hashing scheme that processes dictionary operations
in a tower of universes $U_1 \supseteq U_2 \supseteq \cdots \supseteq U_k$. $U_k$ is called the working universe of the
scheme. The scheme has a vertically organized memory consisting of a (Wb)-bit root
location and an infinite word-size advice location at level 1, and an array of at most m
b-bit locations at each level i ≥ 2; we require that $b \ge \log |U_1|$ so that a location can
store any element of the universes. Each possible value v of the root location defines a
tower of subuniverses $S_1(v) \supseteq S_2(v) \supseteq \cdots \supseteq S_k(v)$ such that $S_i(v) \subseteq U_i$ for all i. $S_k(v)$
is called the working subuniverse corresponding to value v. A search operation starts
by probing the root location; if the element being searched is in the current working
subuniverse, then the search proceeds downwards as in the standard multilevel hashing
model; if the element is not in the current working subuniverse, then the search operation
probes the advice location and continues downwards in the standard fashion. An update
operation changes the memory configuration; a search operation can also change the
memory configuration. The search cost of an operation is defined as the number of read
probes performed by the operation, not counting probes on the advice location. The
refresh cost of an operation is a suitably defined number that is at least the maximum
number of locations at any level that get refreshed by the operation. The cost of an
operation is defined to be the sum of its search cost and its refresh cost. This completes
the description of the model.

In order to prove a lower bound for partial hashing schemes, we have to require these
schemes to use dense subuniverses, at least occasionally. The density of a set S ⊆ T in
T equals |S|/|T|; S is said to be α-dense in T if S has a density of at least α in T. Let
$\bar\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ be a vector of values in the interval [0,1], let e be a positive integer,
and let H be a partial hashing scheme. A configuration C of H is said to be $\bar\alpha$-dense
if the subuniverses $S_1(v) \supseteq S_2(v) \supseteq \cdots \supseteq S_k(v)$ corresponding to configuration C have
respective densities $\ge \alpha_1, \alpha_2, \ldots, \alpha_k$ in universes $U_1, U_2, \ldots, U_k$. A configuration C of H
is said to be $(\bar\alpha, e)$-good if there is an extension sequence of at most e insertions in $U_1$,
starting from C, after which H is in an $\bar\alpha$-dense configuration; otherwise C is said to be
$(\bar\alpha, e)$-bad. H is said to be $(\bar\alpha, e)$-good if H always uses $(\bar\alpha, e)$-good configurations. Our
lower bound applies to partial hashing schemes that are $(\bar\alpha, e)$-good.
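The density conditions are simple to state in code. The following Python sketch is illustrative only: it assumes towers are given as lists of Python sets, and the function names are not from the text. It checks whether a tower of subuniverses meets a vector of density thresholds, level by level.

```python
from fractions import Fraction


def density(s, t):
    """Density |S|/|T| of S in T; S is assumed to be a subset of T."""
    return Fraction(len(s & t), len(t))


def is_dense_configuration(subuniverses, universes, alphas):
    """True if the i-th subuniverse has density >= alphas[i] in the i-th universe."""
    return all(
        density(s, u) >= a for s, u, a in zip(subuniverses, universes, alphas)
    )


if __name__ == "__main__":
    universes = [set(range(16)), set(range(8))]
    subuniverses = [set(range(8)), set(range(2))]
    print(is_dense_configuration(subuniverses, universes,
                                 [Fraction(1, 2), Fraction(1, 4)]))
```

Exact fractions are used so that a density that equals its threshold is not rejected by rounding.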

2.3.2      Adversary 

In  this  section,  we  describe  a  randomized  adversary  for  proving  a  lower  bound  on  the 
worst-case  complexity  of  good  partial  hashing  schemes. 

We define an adversary $A^{r}_{\bar\alpha,n,w,b}$ against a partial hashing scheme H with a tower of
universes $U_1 \supseteq U_2 \supseteq \cdots \supseteq U_k$ and a working universe $U = U_k$ that has been partitioned
into n equal-sized blocks; here $\bar\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)$. The adversary is tailored against
$(\bar\alpha, e)$-good partial hashing schemes with worst-case search cost r and worst-case update
cost w, but it is defined against any partial hashing scheme; we will define e later.
The adversary either performs a complete sequence of O(n) insertions on the initial
dictionary leaving H in an $\bar\alpha$-dense configuration or performs a truncated sequence of
operations which could not be completed because the hashing scheme has entered an
$(\bar\alpha, e)$-bad configuration.

We need some definitions. Let p be a positive integer, let ε ∈ [0,1], and let h be a
partial hash function from a universe U to an array of locations. The domain of h is
denoted by dom(h). A random (ε,p)-sample from h is a random subset R of dom(h)
constructed as follows. First delete from dom(h) all the elements that go into locations
where h sends less than an (εp)-fraction of U; then pick a random p-subset S from
dom(h); for each location into which h sends k elements of S, pick a random subset with
density εk in U that h sends into that location; R is the union of the subsets picked
from the various locations. A partial hash function is said to be (α,W)-narrow, for some
α ∈ [0,1] and some positive integer W, if it sends at least an α-fraction of U into some
set of W locations; otherwise the hash function is said to be (α,W)-wide. A partial hash
function is said to be β-biased, for some β ∈ [0,1], if it sends at most a β-fraction of U
into any single location. We can prune a partial hash function and make it β-biased, for
any given β ∈ [0,1], by deleting sufficiently many elements from its domain.
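The narrow/wide and bias definitions can be illustrated with a short Python sketch. It assumes a partial hash function is represented as a dict from elements to locations and that thresholds are passed as exact fractions. The names are hypothetical, and the pruning rule shown (trim every overloaded location down to the cap) is just one way to delete sufficiently many elements, not necessarily the one intended in the text.

```python
import math
from collections import Counter
from fractions import Fraction


def is_narrow(h, universe_size, alpha, W):
    """(alpha, W)-narrow: some W locations receive >= alpha * |U| elements."""
    sizes = sorted(Counter(h.values()).values(), reverse=True)
    # The W most loaded locations are the best candidates.
    return sum(sizes[:W]) >= alpha * universe_size


def is_biased(h, universe_size, beta):
    """beta-biased: no single location receives more than beta * |U| elements."""
    return all(c <= beta * universe_size for c in Counter(h.values()).values())


def prune(h, universe_size, beta):
    """Return a restriction of h that is beta-biased (one possible pruning)."""
    cap = math.floor(beta * universe_size)
    kept, load = {}, Counter()
    for x in sorted(h):  # fixed order, so the pruning is deterministic
        if load[h[x]] < cap:
            kept[x] = h[x]
            load[h[x]] += 1
    return kept


if __name__ == "__main__":
    h = {x: 0 for x in range(6)} | {6: 1, 7: 1, 8: 1, 9: 2}
    print(is_narrow(h, 12, Fraction(1, 2), 1), is_biased(h, 12, Fraction(1, 4)))
```

A function is wide exactly when `is_narrow` returns False for the same parameters.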

We  are  ready  to  define  the  adversary.  The  adversary  first  inserts  a  random  n-subset 
of  U  that  is  formed  by  picking  a  random  element  from  each  block  of  U  (called  the  first 
batch)  and  then  performs  a  phase  of  insertions,  called  the  tail  phase.  The  tail  phase  is 
defined  recursively  on  r: 

Basis, r = 1: If there exists an extension sequence σ consisting of at most e = n
insertions in $U_1$ taking H to an $\bar\alpha$-dense configuration, perform σ and announce completion;
otherwise H is in an $(\bar\alpha, e)$-bad configuration, so announce truncation.

Induction Step, r ≥ 2: We first perform a recursive subphase, called the narrow
phase;  if  this  phase  gets  truncated  we  perform  another  recursive  subphase,  called  the 
wide  phase.  The  adversary  announces  completion  if  one  of  the  two  phases  completes; 
otherwise  the  adversary  announces  truncation.  The  two  phases  are  performed  as  follows: 

Narrow Phase: Let $W_1$ be a suitably defined positive integer. We construct a new
hashing scheme $H_1$ from H, having a tower of universes $U_1 \supseteq U_2 \supseteq \cdots \supseteq U_k = U_{k+1}$, by
collapsing the top two levels of H into a single level. Each possible value stored at the
root location of H defines a partial hash function from $U_k$ to the set of level-2 locations
which equals the second probe function used by search operations when that value is
stored at the root. We rank the level-2 locations of H according to each partial hash
function h used by H at its root location as follows. The i-th location of a partial hash
function h from a domain D ⊆ U to the set of level-2 locations is defined to be the
location into which h sends the i-th largest subset of D; we resolve ties in favor of the
location with the smallest index. The root location of $H_1$, at any time, consists of the
root location of H juxtaposed with the first $W_1/2$ level-2 locations of the current root
partial hash function (this set of locations is denoted by $L_1$); the advice location of $H_1$
consists of the advice location of H juxtaposed with the remaining level-2 locations of
H; the i-th level of $H_1$, for i ≥ 2, coincides with the (i+1)-st level of H. The working
subuniverse of $H_1$, at any time, is the subset of the working subuniverse of H that the
root hash function of H maps into the set of locations $L_1$; the tower of subuniverses of
$H_1$, at any time, consists of the tower of subuniverses of H followed by $H_1$'s working
subuniverse. Operations are performed by $H_1$ by simulating the behaviour of H. The
search cost of an operation on $H_1$ is defined in the usual way. The refresh cost of an
operation on $H_1$ is defined equal to the refresh cost of that operation on H. Notice that
this definition of refresh cost is more liberal than the usual definition of the refresh cost
as the maximum number of locations at any level that get refreshed.
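The ranking rule for level-2 locations, and the way it determines the working subuniverse of the collapsed scheme, can be sketched as follows. This is a hedged illustration: partial hash functions are modelled as dicts from elements to location indices, and all function names are assumptions made for the example.

```python
from collections import Counter


def ranked_locations(h, num_locations):
    """Locations ordered by decreasing preimage size, ties to the smallest index."""
    sizes = Counter(h.values())  # locations with no preimage count as size 0
    return sorted(range(num_locations), key=lambda loc: (-sizes[loc], loc))


def top_locations(h, num_locations, count):
    """The first `count` locations in the ranking (the set juxtaposed with the root)."""
    return ranked_locations(h, num_locations)[:count]


def restricted_subuniverse(h, working, locations):
    """Elements of the working subuniverse that h maps into the given locations."""
    locations = set(locations)
    return {x for x in working if x in h and h[x] in locations}


if __name__ == "__main__":
    h = {0: 2, 1: 2, 2: 0, 3: 1, 4: 1}
    print(ranked_locations(h, 4))
```

In the example, locations 1 and 2 both receive two elements, so the tie is resolved in favor of location 1.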

The narrow phase of the adversary $A^{r}_{\bar\alpha,n,w,b}$ against H is recursively defined to be the
tail phase of the adversary $A^{r-1}_{\bar\alpha_1,n,w,b}$ against $H_1$, starting from the same configuration as
H's configuration prior to the narrow phase; here $\bar\alpha_1 = (\alpha_1, \alpha_2, \ldots, \alpha_k, \alpha_k/2)$. The narrow
phase completes if the recursive adversary announces completion; otherwise the narrow
phase is truncated. The narrow phase always forces H to use $(\alpha_k/2, W_1/2)$-narrow
root hash functions; if the phase gets truncated then H will always use $(\alpha_k/2, W_1/2)$-wide
root hash functions in any $\bar\alpha$-dense configuration during the next $e_1$ insertions in
$U_1$; we will specify $e_1$ later.

Wide Phase: The wide phase consists of two subphases: the first phase incrementally
constructs a random subuniverse $U_{k+1} \subseteq U$ and alternately performs random insertion
batches in $U_{k+1}$ and extension batches in $U_1$; the second phase recursively performs
further random insertion batches in $U_{k+1}$ and extension batches in $U_1$. The wide phase
completes either if the second phase completes or if the second phase gets truncated but
H is in an $(\bar\alpha, e)$-good configuration following the phase (in the latter case, the wide
phase is completed by performing a suitable extension sequence in $U_1$ that takes H to an
$\bar\alpha$-dense configuration); otherwise the wide phase is truncated. We can prune every root
hash function used by H in an $\bar\alpha$-dense configuration during the wide phase by deleting
from its domain a subset of density at most $(\alpha_k/2)$ in U and make it $(\alpha_k/W_1)$-biased
since these hash functions are $(\alpha_k/2, W_1/2)$-wide. Let $\alpha = \alpha_k$, let $W_2$ and $e_2$ be suitably
chosen positive integers, and let $\varepsilon = (\alpha^2/64W_2^2)$. The two subphases are performed as
follows:

First Phase: We maintain a pair of sets $U^1$ and $U^2 \subseteq U^1$ that are initially the empty
sets and repeat the following step as many times as possible:

Step: If there is an extension sequence σ of at most $e_2$ insertions in $U_1$ following
which H is in an $\bar\alpha$-dense configuration and H's pruned root hash function has a domain
D such that $D \setminus U^1$ is (α/4)-dense in U, then perform σ and continue the step; otherwise
terminate the step. This ensures that, in any $\bar\alpha$-dense configuration of H during the next
$e_2$ insertions in $U_1$ following the first phase, the domain of H's pruned root hash function
will always intersect $U^1$ in a (α/4)-dense subset of U. Let h be the pruned root hash
function used by H following σ, and let D' be a domain of density (α/4) in U that is
disjoint to $U^1$ and on which h is defined. We insert D' into $U^1$, pick a random
$(\varepsilon, W_2/2)$-sample $U^3$ from $h|_{D'}$ and insert it into $U^2$, and, finally, insert a random
$(\alpha e_2/8)$-subset S of $U^3$ into the dictionary. The set S is constructed by uniformly
partitioning $U^3$ into $(\alpha e_2/8)$ equal blocks in any way and picking a random element from
each block. This completes the description of the step.
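The block-sampling construction used for the set S (and for the first batch of the adversary) is easy to state in code. The sketch below is an assumption-laden illustration: it partitions a sorted copy of the set into consecutive blocks, which is just one fixed way of partitioning uniformly, and the function name is hypothetical.

```python
import random


def block_sample(universe, n, rng):
    """Pick one random element from each of n equal-sized blocks of the universe.

    The blocks are consecutive runs of the sorted universe; the text allows any
    fixed uniform partition, so this choice is only for illustration.
    """
    items = sorted(universe)
    if n <= 0 or len(items) % n != 0:
        raise ValueError("universe size must be a positive multiple of n")
    size = len(items) // n
    return [rng.choice(items[i * size:(i + 1) * size]) for i in range(n)]


if __name__ == "__main__":
    print(block_sample(range(20), 5, random.Random(0)))
```

Because every block contributes exactly one element, the sample always has n distinct elements, one per block.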

Second Phase: Let $U_{k+1}$ = the value of set $U^2$ at the end of the first phase, let $L_2$
= the set of level-2 locations from which the samples $U^3$ were constructed during the
steps of the first phase, and let n' = the total number of random elements inserted into
the dictionary during the first phase. The second phase is performed only if $U_{k+1} \ne \emptyset$;
otherwise the second phase is truncated. We construct a new hashing scheme $H_2$ having
a tower of universes $U_1 \supseteq U_2 \supseteq \cdots \supseteq U_k \supseteq U_{k+1}$ by collapsing the top two levels of H,
by appending the set of locations $L_2$ to the root location of H, and by appending the
remaining level-2 locations to the advice location of H. The working subuniverse of $H_2$,
at any time, is the subset of $U_{k+1}$ that H's pruned root hash function sends into the
locations of $L_2$. Operations on $H_2$ are performed by simulating H's behaviour; the costs
of operations on $H_2$ are defined analogously to $H_1$.

The second phase is recursively defined as the tail phase of adversary $A^{r-1}_{\bar\alpha_2,n',w,b}$ against
$H_2$; here $\bar\alpha_2 = (\alpha_1, \alpha_2, \ldots, \alpha_k, \alpha_k/32)$.

This  essentially  completes  the  definition  of  the  tail  phase  and  of  the  adversary. 

We now mention some details that had been omitted in the definition of the wide
phase. During the wide phase, often, the root hash function has to be restricted to a
subdomain such as when pruning a root hash function or when choosing an appropriate
subdomain D' of a pruned root hash function h during a step of the first phase. These
restrictions are performed in a unique way. We prune every root hash function in a
unique way depending only on the hash function and on the bias β. We choose domain
D' during a step of the first phase in a unique way depending only on the pruned hash
function h and $U^1$. In the construction of the random subsets $S \subseteq U^3$ during the first
phase, we uniformly partition $U^3$ in a unique way depending only on its value.

The size of the universe reduces by a factor of $128m/\alpha^2$ when carrying out the
recursion during the wide phase. We need $|U| \ge (32)^{r^2}(128m)^{r}\,n/\alpha^{2r}$ to ensure that the
universe has at least n elements by the time the adversary reaches the r-th level.

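The parameters $e$, $e_1$, $e_2$, $W_1$, and $W_2$ are defined as functions of the adversary
parameters r, α, and n using the following recurrence relations:

$e^{r}(\alpha,n) = \begin{cases} \text{[illegible in source]} & \text{if } r \ge 2 \\ n & \text{otherwise} \end{cases}$    (2.1)

$e_1^{r}(\alpha,n) = e^{r-1}(\alpha/2,\, n)$    (2.2)

$e_2^{r}(\alpha,n) = \alpha\, W_1^{r}(\alpha,n)/(c_0 w)$    (2.3)

$W_1^{r}(\alpha,n) = W^{r-1}(\alpha/2,\, n)$    (2.4)

$W^{r}(\alpha,n) = \begin{cases} W_2^{r}(\alpha,n)/(c_1 b) & \text{if } r \ge 2 \\ \alpha n/(c_1 b) & \text{otherwise} \end{cases}$    (2.5)

$W_2^{r}(\alpha,n) = W^{r-1}(\alpha/32,\ \alpha\, e_2^{r}(\alpha,n)/8)$    (2.6)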

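The constants $c_0$ and $c_1$ used in these recurrences will be specified later. We obtain a
recurrence involving W alone by eliminating the other parameters:

$W^{r}(\alpha,n) = \begin{cases} \dfrac{1}{c_1 b}\, W^{r-1}\!\left(\dfrac{\alpha}{32},\ \dfrac{\alpha^{2}\, W^{r-1}(\alpha/2,\, n)}{8 c_0 w}\right) & \text{if } r \ge 2 \\ \alpha n/(c_1 b) & \text{otherwise} \end{cases}$

This recurrence has the following solution:

[solution illegible in source; the surviving exponent is $2^{r+1} - 3$]

Back-substituting this solution into the above recurrences, we find the values of
parameters $e$, $e_1$, $e_2$, $W_1$, and $W_2$ to be approximately $n(\alpha/wb)^{2^{\Theta(r)}}$.

The adversary definition requires all the parameters to be at least 1. We need n to
be sufficiently large ($n \approx (wb/\alpha)^{2^{\Theta(r)}}$) to guarantee this.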

We  state  some  useful  facts  about  the  adversary.  These  facts  can  be  proved  using  the 
definition  of  the  adversary  and  induction. 

Lemma 2.3 i. $W^{r}(\alpha,n) \le W_2^{r}(\alpha,n)/2 \le W_1^{r}(\alpha,n)/2$, for all r, α, and n.
ii. $(6/\alpha)\,e_2^{r}(\alpha,n) \le e_1^{r}(\alpha,n)$, for all r, α, and n.

Lemma 2.4 i. The maximum number of insertions performed by $A^{r}_{\bar\alpha,n,w,b}$ is at most 3n.

ii. The maximum number of random insertion batches performed by $A^{r}_{\bar\alpha,n,w,b}$ is at most
$(34)^{r-1}/\alpha$.

iii. The maximum number of insertions performed by $A^{r}_{\bar\alpha,n,w,b}$ during the second phase
is at most $e_2$.

iv. The maximum number of insertions performed by $A^{r}_{\bar\alpha,n,w,b}$ during the wide phase is
at most $(6/\alpha)e_2 \le e_1$.

Lemma 2.5 i. A complete adversary sequence leaves the hashing scheme in an $\bar\alpha$-dense
configuration; a truncated adversary sequence leaves the hashing scheme in an $(\bar\alpha, e)$-bad
configuration.

ii. Whenever the adversary performs a random insertion batch, the hashing scheme is
in an $\bar\alpha$-dense configuration.

iii. The hashing scheme always uses $(\alpha/2, W_1/2)$-wide root hash functions in an $\bar\alpha$-dense
configuration during the wide phase of the adversary.

2.3.3     Two  Random  Sampling  Lemmas 

The analysis of the wide phase uses two technical lemmas for analyzing the behaviour
of random samples under a sequence of partial hash functions, so, first, in this section,
we state and prove these lemmas.

We need some definitions before stating the lemmas. Consider partial hash functions
from a universe U to an array of m locations. An α-hash function, for any α ∈ [0,1],
is a partial hash function that is defined on a domain of density α in U. A sequence of
partial hash functions is called a hashtopy; we think of a hashtopy as a deformation of
a partial hash function over time, where time signifies the integers from 1 to the length
of the hashtopy. For a hashtopy ℋ, $ℋ_t$ denotes the t-th partial hash function in the
hashtopy. A hashtopy consisting of only α-hash functions is called an α-hashtopy; a
β-biased hashtopy is defined analogously. Consider a hashtopy ℋ. A location snap of
ℋ is a pair (l,t), where 1 ≤ l ≤ m and 1 ≤ t ≤ |ℋ|. A set S ⊆ U refreshes a location
snap (l,t) of ℋ if, for some element x ∈ S, ℋ sends x to location l at time t and ℋ had
sent x to a different location the last time before t when x appeared in ℋ. The refresh
cost incurred by a set S ⊆ U in ℋ is defined as the total number of location snaps of ℋ
refreshed by S. The oscillation of an element x ∈ U in ℋ is defined as the refresh cost
incurred by x in ℋ. The oscillation of ℋ is defined as the mean oscillation of an element
of U in ℋ. A fixed element of ℋ is defined as an element whose oscillation in ℋ equals
0; a fixed subset of ℋ is defined analogously.
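These definitions can be made concrete with a small Python sketch. It assumes a hashtopy is represented as a list of dicts (each a partial hash function from elements to locations, with time starting at 1) and that an element's first appearance refreshes nothing, since there is no earlier location to compare against. The function names are illustrative.

```python
from fractions import Fraction


def refreshed_snaps(hashtopy, subset):
    """Set of location snaps (l, t) refreshed by the given subset."""
    snaps = set()
    last = {}  # last location each element was sent to when it appeared
    for t, h in enumerate(hashtopy, start=1):
        for x in subset:
            if x in h:
                if x in last and last[x] != h[x]:
                    snaps.add((h[x], t))
                last[x] = h[x]
    return snaps


def refresh_cost(hashtopy, subset):
    """Number of distinct location snaps refreshed by the subset."""
    return len(refreshed_snaps(hashtopy, subset))


def oscillation(hashtopy, universe):
    """Mean, over the universe, of the refresh cost of each single element."""
    universe = list(universe)
    total = sum(refresh_cost(hashtopy, {x}) for x in universe)
    return Fraction(total, len(universe))


def fixed_elements(hashtopy, universe):
    """Elements whose oscillation in the hashtopy equals 0."""
    return {x for x in universe if refresh_cost(hashtopy, {x}) == 0}


if __name__ == "__main__":
    hashtopy = [{0: 0, 1: 0, 2: 1}, {0: 1, 1: 1, 2: 1}, {0: 1, 3: 2}]
    print(oscillation(hashtopy, range(4)), fixed_elements(hashtopy, range(4)))
```

In the example, elements 0 and 1 both move into location 1 at time 2, so together they refresh a single location snap even though each has oscillation 1. This is why the refresh cost of a set can be smaller than the sum of the oscillations of its elements.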

Our first lemma estimates the refresh cost incurred by a random sample in a β-biased
hashtopy when the sample is constructed by uniformly partitioning the universe
and sampling each partition independently.

Lemma 2.6 (Refresh Cost Lemma) Consider a β-biased hashtopy ℋ from a universe
U to an array of m locations. Let T = |ℋ|, let n ≥ 1/β be a positive integer, and
let ω = the oscillation of ℋ. Partition U into n equal-sized blocks, and select a random
n-subset R from U by picking a random element from each block. R incurs a refresh cost
of at least ω/40 in ℋ with probability at least $(1 - e^{-[\text{exponent illegible in source}]})$.

Our  second  lemma  says  that  a  random  sample  from  a  low  oscillation  a-hashtopy 
is  likely  to  intersect  the  domains  of  the  hash  functions  in  the  hashtopy  in  dense  fixed 
subsets. 

Lemma 2.7 (Density Lemma) Let $ℋ = (h_1, h_2, \ldots, h_{p+q})$ be a (1/p)-hashtopy from
a universe U to an array of m locations in which the domains of $h_1, h_2, \ldots, h_p$ are all
disjoint. Suppose that the oscillation of ℋ is at most (1/2p). Let k be a positive integer,
let $\varepsilon \le (1/4kmp^2)$, and suppose that $|U| \ge 1/\varepsilon$. Construct a random sample R of U by
picking a random (ε,k)-sample from each of the hash functions $h_1, h_2, \ldots, h_p$ and taking
the union of these samples. Let F = the set of elements of U that are left fixed by ℋ.
With probability $\ge (1 - 2q\,e^{-[\text{exponent illegible in source}]})$, $F \cap R \cap \mathrm{dom}(h_i)$ is a (1/8p)-dense
subset of R, for all i ≥ p + 1.

We  need  some  further  lemmas  for  proving  the  above  lemmas. 

Lemma 2.8 (Martingale Lemma) Let n be a positive integer and let 0 ≤ β < α ≤ 1.
Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables in the range [0,1] that are exposed
one by one and satisfy the following relation:

$E[X_1] + E[X_2 \mid X_1] + E[X_3 \mid X_1, X_2] + \cdots + E[X_n \mid X_1, X_2, \ldots, X_{n-1}] \ge \alpha n.$

Then $\Pr[X_1 + X_2 + \cdots + X_n > \beta n] \ge (1 - e^{-c_{\alpha,\beta}\,n})$, where $c_{\alpha,\beta}$ was defined in the Binary
Sampling Lemma.

Proof. We generalize the proof of Hoeffding's inequality [20] that gives the lemma
when the random variables are mutually independent and have fixed means. We show
that

$E[e^{\mu(X_1 + X_2 + \cdots + X_n)}] \le (1 - \alpha + \alpha e^{\mu})^{n}$,     for all μ ≤ 0,

and then complete the proof of the lemma as in Hoeffding's inequality. We prove this
statement by induction on n using Hoeffding's ideas, namely the convexity of $e^{x}$ and the
inequality between arithmetic and geometric means.

Basis, n = 1: For any x in the range [0,1], the convexity of $e^{\mu x}$ gives

$e^{\mu x} \le 1 - x + x e^{\mu}$,     so taking expectations, we get

$E[e^{\mu X_1}] \le 1 - E[X_1] + E[X_1] e^{\mu} \le 1 - \alpha + \alpha e^{\mu}$,     since μ ≤ 0.

Induction Step, n ≥ 2: The idea is to expose $X_1$ first and apply induction to the
sequence $X_2, \ldots, X_n$:

$E[e^{\mu(X_1 + \cdots + X_n)}] = E_{X_1}\!\left[ e^{\mu X_1}\, E[e^{\mu(X_2 + \cdots + X_n)} \mid X_1] \right]$

$\le E_{X_1}\!\left[ e^{\mu X_1} \left(1 - \frac{\alpha n - E[X_1]}{n-1}(1 - e^{\mu})\right)^{n-1} \right]$     (by induction)

$= E[e^{\mu X_1}] \left(1 - \frac{\alpha n - E[X_1]}{n-1}(1 - e^{\mu})\right)^{n-1}$

$\le \left(1 - E[X_1](1 - e^{\mu})\right) \left(1 - \frac{\alpha n - E[X_1]}{n-1}(1 - e^{\mu})\right)^{n-1}$     (by convexity of $e^{\mu x}$)

$\le (1 - \alpha + \alpha e^{\mu})^{n}$     (by the arithmetic and geometric means inequality).


This  completes  the  proof  of  the  lemma.    D 
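
The two inequalities that drive the induction can be sanity-checked numerically. The Python sketch below is only an illustration and not part of the proof: the function names, the grid of test values, and the tolerance are assumptions made here, and `e1` stands for the constant $E[X_1]$.

```python
import math


def convexity_gap(mu, x):
    """Return (1 - x + x*e^mu) - e^(mu*x); this is nonnegative for x in [0, 1]."""
    return (1 - x + x * math.exp(mu)) - math.exp(mu * x)


def amgm_sides(alpha, n, e1, mu):
    """Return (lhs, rhs) of the last step of the induction, with e1 = E[X_1]."""
    q = 1 - math.exp(mu)                  # q > 0 because mu < 0
    rest = (alpha * n - e1) / (n - 1)     # mean still owed by the other n-1 variables
    lhs = (1 - e1 * q) * (1 - rest * q) ** (n - 1)
    rhs = (1 - alpha * q) ** n
    return lhs, rhs


def check(mu=-1.0, alpha=0.5, n=5, steps=20, tol=1e-12):
    """Check both inequalities for x (and e1) on a grid of values in [0, 1]."""
    for s in range(steps + 1):
        x = s / steps
        if convexity_gap(mu, x) < -tol:
            return False
        lhs, rhs = amgm_sides(alpha, n, x, mu)
        if lhs > rhs + tol:
            return False
    return True
```

The second inequality is tight exactly when `e1` equals `alpha`, which is why the check allows a small floating-point tolerance.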
Lemma 2.9 (Fractional Sampling Lemma) Let $k_1, k_2, \ldots, k_t$ be positive integers with
sum $k$, let $\alpha_1, \alpha_2, \ldots, \alpha_t$ be values in the interval $[0, 1]$ with weighted mean
$\alpha = (k_1\alpha_1 + \cdots + k_t\alpha_t)/k$, and let $0 < \beta < \alpha$. Consider a family of disjoint populations
$U_1, U_2, \ldots, U_t$ of values in the interval $[0, 1]$ with means $\alpha_1, \alpha_2, \ldots, \alpha_t$, respectively.
Construct a random $k$-subset $R$ by picking a random $k_i$-subset from $U_i$, for each $i$, and taking the union
of these subsets. $R$ has a mean greater than $\beta$ with probability at least $1 - e^{-c_{\alpha,\beta} k}$,
where $c_{\alpha,\beta}$ was defined in the Binary Sampling Lemma.

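Proof. Suppose $R$ is constructed by selecting a sequence of random values $Y_{11}, Y_{12}, \ldots, Y_{1k_1}$
from $U_1$, $Y_{21}, Y_{22}, \ldots, Y_{2k_2}$ from $U_2$, and so on. Define a set of independent random
variables $\{X_{ij} \mid 1 \le i \le t,\ 1 \le j \le k_i\}$ as follows: $X_{ij}$ is simply a random value chosen from $U_i$. A result
of Hoeffding [20] (Theorem 4) says that

$$E f\Big(\sum_j Y_{ij}\Big) \le E f\Big(\sum_j X_{ij}\Big)$$

for any convex function $f$ and for all $i$. For all real numbers $\mu$, we have:

$$\begin{aligned}
E\, e^{\mu \sum_{i,j} Y_{ij}}
&= \prod_{i=1}^{t} E\, e^{\mu \sum_j Y_{ij}} && \text{(as } \{\textstyle\sum_j Y_{ij} \mid i = 1, \ldots, t\} \text{ are mutually independent)} \\
&\le \prod_{i=1}^{t} E\, e^{\mu \sum_j X_{ij}} && \text{(by convexity of } e^{\mu x} \text{ and by Hoeffding's result)} \\
&= \prod_{i,j} E\, e^{\mu X_{ij}} && \text{(since the } X_{ij} \text{ are mutually independent).}
\end{aligned}$$

We use this inequality and complete the proof of the lemma as in Hoeffding's inequality [20]. □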

We  are  ready  to  prove  the  main  lemmas  of  this  section. 

Proof of the Refresh Cost Lemma. The basic idea behind the proof is to
construct the random sample $R$ incrementally and use the Martingale Lemma. Let
$U_1, U_2, \ldots, U_n$ denote the blocks of the partition of $U$. Construct $R$ incrementally by
randomly selecting elements from the blocks, one by one. Denote the set of first $i$
elements added to $R$ by $R_i$. For each $i \in \{1, 2, \ldots, n\}$, define a random variable $X_i$ =
the refresh cost incurred by $R_i$ in $\mathcal{H}$ minus the refresh cost incurred by $R_{i-1}$ in $\mathcal{H}$. The
refresh cost incurred by $R$ in $\mathcal{H}$ equals $X_1 + X_2 + \cdots + X_n$.

In order to apply the Martingale Lemma, we construct a sequence of random variables
$\{Y_i \mid i = 1, \ldots, n\}$ whose sum has the same distribution as the sum of the $X_i$'s for small
values of the sums:

$$Y_i = \begin{cases} X_i & \text{if } X_1 + \cdots + X_i < \omega/(2\beta) \\ T - 1 & \text{otherwise.} \end{cases}$$

Observe that $\sum_i Y_i = \sum_i X_i$ whenever $\sum_i X_i < \omega/(2\beta)$ and that $\sum_i Y_i \ge \sum_i X_i$ always. It
follows that the probability that $\sum_i Y_i$ exceeds $\omega/(4\beta)$ equals the probability that $\sum_i X_i$
exceeds $\omega/(4\beta)$. We normalize the $Y_i$'s to the interval $[0, 1]$ and form new random variables
$Z_i = Y_i/(T - 1)$, for all $i$. The following lemma says that the $Z_i$'s satisfy the condition
of the Martingale Lemma.

Lemma 2.10

$$E[Z_1] + E[Z_2 \mid R_1] + \cdots + E[Z_n \mid R_1, R_2, \ldots, R_{n-1}] \ge \frac{n\omega}{2(T - 1)}.$$

By the Martingale Lemma and using $n \ge 1/\beta$, we conclude that the probability that
$\sum_i Z_i$ exceeds $\omega/(4\beta(T - 1))$ is at least $1 - e^{-n\omega/(16(T-1))}$. The Refresh Cost Lemma
follows immediately.

It remains to prove Lemma 2.10.

Proof of Lemma 2.10. We need some definitions. For any set $S \subseteq U$ and element
$x \in U$, the oscillation of $x$ modulo $S$ is defined as the refresh cost incurred by $S \cup \{x\}$
minus the refresh cost incurred by $S$. For any pair of sets $S, T \subseteq U$, the oscillation of $T$
modulo $S$, denoted $\omega_S(T)$, is defined as the mean oscillation of an element of $T$ modulo
$S$.

Consider the construction of the sample $R$ by adding elements from the blocks, one
by one. Define:

$$\omega_i = \big(\omega_{R_{i-1}}(U_i) + \omega_{R_{i-1}}(U_{i+1}) + \cdots + \omega_{R_{i-1}}(U_n)\big)/n, \quad \text{and}$$

$$\kappa = \max\{t \mid X_1 + X_2 + \cdots + X_t < \omega/(2\beta)\}.$$

We have

$$\omega_{\kappa+1} \le (n - \kappa)(T - 1)/n = \big(E[Z_{\kappa+1} \mid R_{\kappa}] + E[Z_{\kappa+2} \mid R_{\kappa+1}] + \cdots + E[Z_n \mid R_{n-1}]\big)(T - 1)/n.$$

For any $i \le \kappa$, we have

$$\omega_i - \omega_{i+1} \le E[Z_i \mid R_{i-1}](T - 1)/n + \beta X_i$$

(as each newly refreshed location at step $i$ reduces
the oscillation of at most a $\beta$-fraction of $U$).

Summing $i$ from 1 to $\kappa$, we get

$$\begin{aligned}
\omega &= (\omega_1 - \omega_2) + (\omega_2 - \omega_3) + \cdots + (\omega_{\kappa} - \omega_{\kappa+1}) + \omega_{\kappa+1} \\
&\le \big(E[Z_1] + E[Z_2 \mid R_1] + \cdots + E[Z_n \mid R_{n-1}]\big)(T - 1)/n + \beta(X_1 + X_2 + \cdots + X_{\kappa}) \\
&\le \big(E[Z_1] + E[Z_2 \mid R_1] + \cdots + E[Z_n \mid R_{n-1}]\big)(T - 1)/n + \omega/2.
\end{aligned}$$

The lemma follows from this inequality. □

This completes the proof of the Refresh Cost Lemma. □

Proof of the Density Lemma. Since the oscillation of $\mathcal{H}$ is at most $1/2p$, it follows
that $|F| \ge (1 - 1/2p)|U|$. Thus $F \cap \mathrm{dom}(h_j)$ has a density of at least $1/2p$ in $U$, for all $j$.
We complete the proof of the lemma by showing that the intersection of $R$ with any fixed
$(1/2p)$-dense subset of $U$ is a $(1/8p)$-dense subset of $R$ with probability $\ge 1 - 2e^{-k/192}$.

Let $S$ be any $(1/2p)$-dense subset of $U$. We want to estimate the probability that
$R \cap S$ is a $(1/8p)$-dense subset of $R$. Let $D$ denote the set of elements that are deleted
from the domains of hash functions $h_1, \ldots, h_p$ when $R$ is constructed by taking $(\epsilon, k)$-samples
of these hash functions. Since $\epsilon < 1/(4kmp^2)$, it follows that $|D| \le |U|/4p$.
Define $S_1 = S \setminus D$; $S_1$ is a $(1/4p)$-dense subset of $U$. We show that $S_1 \cap R$ is likely to be
$(1/8p)$-dense in $R$ using two successive applications of the Fractional Sampling Lemma.

Let us review the construction of $R$. $R$ is constructed by picking a random $k$-subset
from each of the sets $U_j = \mathrm{dom}(h_j) \setminus D$, for $j \le p$, by forming the union $R_1$ of these
subsets, by picking dense subsets of $U$ that go into the same locations as the elements
of $R_1$, and by forming the union of these dense subsets.

For each element $x \in \mathrm{dom}(h_i)$, define

$$\mathrm{value}(x) = \frac{|S_1 \cap h_i^{-1}(h_i(x))|}{|h_i^{-1}(h_i(x))|}.$$

We  have 


f:xg[;.value{i)     >     ^^gdom(/..)^'^"^(^)'     ^^^ 
y^,  E'j-pff  value(x)  „  ,     ,    ,  ^   ,  ,, 

P 
By the Fractional Sampling Lemma, it follows that $R_1$ has a mean value of at least $1/6p$
with probability $\ge 1 - e^{-k/72}$.

We estimate the probability that $R \cap S_1$ is $(1/8p)$-dense in $R$, given that $R_1$ has a
mean value of at least $1/6p$. For all $i \in \{1, 2, \ldots, p\}$ and all $l \in \{1, 2, \ldots, m\}$, define

$$U_{i,l} = h_i^{-1}(l), \quad \text{and} \quad k_{i,l} = |R_1 \cap U_{i,l}|.$$

Define the characteristic function $\chi_{S_1}$ of set $S_1$:

$$\chi_{S_1}(x) = \begin{cases} 1 & \text{if } x \in S_1 \\ 0 & \text{otherwise.} \end{cases}$$

Since the mean value of $R_1$ is at least $1/6p$, it follows that

$$\frac{\sum_{i,l} k_{i,l}\, E_{x \in U_{i,l}}\, \chi_{S_1}(x)}{kp} \ge 1/6p.$$

Since $R$ is formed by picking a random $(k_{i,l}\,\epsilon|U|)$-subset of $U_{i,l}$ and taking the union of
these subsets, by the Fractional Sampling Lemma, it follows that $R \cap S_1$ is a $(1/8p)$-dense
subset of $R$ with conditional probability $\ge 1 - e^{-k/192}$, given that $R_1$ has a mean value
$\ge 1/6p$.

We conclude that $R \cap S_1$ is a $(1/8p)$-dense subset of $R$ with probability $\ge 1 - 2e^{-k/192}$.
This completes the proof of the lemma. □

2.3.4     The  Worst-case  Lower  Bound 

In this section, we analyze the adversary defined in Section 2.3.2 and prove a lower bound
of $\Omega(\log(\log n / \log b))$ on the worst-case cost of dictionary operations in the refresh cost
multilevel hashing model, where $n$ = the maximum dictionary size during the operation
sequences.

Let $H$ be any partial hashing scheme. We say that $H$ succeeds on a sequence of
operations $\sigma$ performed by adversary $A^{r}_{\bar\alpha,n,w,b}$ if $\sigma$ is a complete sequence of operations,
the maximum cost of an update operation in $\sigma$ is at most $w$, and the maximum level of
a dictionary element during $\sigma$ is at most $r$.

The following lemma bounds the success probability of a partial hashing scheme
against the adversary.

Lemma 2.11 Let $\bar\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ be a sequence of values in the interval $[0, 1]$, let
$\alpha = \alpha_k$, and let $W = W^{r}(\alpha, n) \ge 1$. Let $H$ be a partial hashing scheme that has a
$(Wb)$-bit root location and $b$-bit locations at levels $\ge 2$. $H$ succeeds against $A^{r}_{\bar\alpha,n,w,b}$ with
probability $\le e^{-Wb}$.

The lemma gives a trade-off between the worst-case costs of search operations and
update operations incurred by a hashing scheme. Let $H$ be a multilevel hashing scheme
in the refresh cost model with wordsize $b$ that incurs a worst-case cost of $r$ on searches
and a worst-case cost of $w$ on updates in processing sequences of $O(n)$ operations. $H$
has the following trade-off between $r$, $w$, and $b$:

w  =  Q(n^'^'/b). 

This trade-off gives a lower bound of $\Omega(\log(\log n / \log b))$ for $\max\{r, w\}$.

The rest of this section is devoted to proving Lemma 2.11. The proof is by induction
on $r$. Let $U_1 \supseteq U_2 \supseteq \cdots \supseteq U_k = U$ be the tower of universes of $H$.

Basis, $r = 1$: Following a successful adversary sequence, the working subuniverse
of $H$ has density $\ge \alpha$ in $U$. Let $S(v)$ be a fixed $\alpha$-dense working subuniverse used
by $H$ following a successful adversary sequence. By the Fractional Sampling Lemma,
the random first batch has $\ge \alpha n/4$ elements in $S(v)$ with probability $\ge 1 - e^{-\alpha n/16}$.
The root location can store at most $Wb = \alpha n/c_1$ distinct elements of $U$, so if $c_1 > 4$,
some element of the first batch is stored at a level $\ge 2$ with this probability. Since the
root location can store at most $2^{\alpha n/c_1}$ distinct values $v$, the success probability of $H$ is
$\le 2^{\alpha n/c_1} e^{-\alpha n/16}$. We choose $c_1 > 32$ so that the success probability of $H$ is $\le e^{-\alpha n/c_1}$.

Induction Step, $r \ge 2$: We estimate the probability that $H$ succeeds against the
adversary:

Pr[success] = Pr[narrow phase completes successfully] +
              Pr[wide phase completes successfully]

If the narrow phase completes successfully then the induced hashing scheme $H_1$
succeeds against its recursive adversary. Thus, by induction, the first term is bounded
by $e^{-W_1 b}$.

We bound the second term by showing that any fixed sequence of root hash functions
$\mathcal{H} = (h_1, h_2, \ldots, h_l)$ used before the random insertion batches during the wide phase has
a low probability of success; actually, here $h_l$ denotes the root hash function used at
the end of the adversary sequence and $h_{l-1}$ denotes the root hash function used before
the last random insertion batch. The hash functions $h_i$ are all $(\alpha/2, W_1/2)$-wide, so
they can be pruned to obtain $(\alpha/W_1)$-biased hash functions $g_i$ by deleting subsets of
density $\le \alpha/2$ in $U$ from their domains. The $g_i$'s are $(\alpha/2)$-hash functions. We apply
the incremental construction of the random subuniverse $U_{k+1}$ during the first phase to
the sequence $(g_1, g_2, \ldots, g_l)$ and determine the prefix $(g_1, g_2, \ldots, g_p)$ of the sequence from
which $U_{k+1}$ is constructed through random sampling. When sampling from a pruned
hash function $g_i$, we restrict $g_i$ to a subdomain $D'$ of density $\alpha/4$ in $U$ and sample from
the restriction $f_i = g_i|_{D'}$. The domains of the restrictions $f_i$, for $1 \le i \le p$, are all disjoint
and the union of these domains equals $U'$. The domains of the hash functions $g_j$, for
$j \ge p + 1$, intersect $U'$ in an $(\alpha/4)$-dense subset of $U$, since, otherwise, the construction of
$U_{k+1}$ would have also involved $g_j$. We restrict each hash function $g_j$, for $j \ge p + 1$, to the
subdomain $\mathrm{dom}(g_j) \cap U'$ and obtain an $(\alpha/4)$-hash function $f_j$. Consider the hashtopy
$\mathcal{F} = (f_1, f_2, \ldots, f_l)$ over universe $U'$. Two cases arise:

Case 1. The oscillation of $\mathcal{F}$ is at least $\alpha/8$: By the Refresh Cost Lemma, since $\mathcal{F}$
is an $(\alpha/W_1)$-biased hashtopy, the first batch of the adversary incurs a refresh cost of at
least $W_1/32$ in $\mathcal{F}$ with probability at least $1 - e^{-n\alpha/(128 l)}$. We choose $c_0 > 192$ so that
the total refresh cost available during the wide phase is at most $6 e_2 w/\alpha < W_1/32$. The
probability of success of $H$ is $\le e^{-n\alpha/(128 l)}$.

Case 2. The oscillation of $\mathcal{F}$ is at most $\alpha/8$: $U_{k+1}$ is formed by picking $(4\epsilon/\alpha p, W_2/2)$-samples
from the hash functions $f_i$, for $i \le p$, and taking the union of these samples;
here we have scaled $\epsilon$ to convert densities from $U$ to $U'$. $\mathcal{F}$ is a $(1/p)$-hashtopy over
$U'$, the oscillation of $\mathcal{F}$ is at most $1/2p$, and $(4\epsilon/\alpha p) < 1/(4kmp^2)$. Let $F$ = the set
of elements of $U'$ that are left fixed by $\mathcal{F}$. By the Density Lemma, with probability
$\ge 1 - 2l\,e^{-W_2/384}$, $F \cap U_{k+1} \cap \mathrm{dom}(h_i)$ is $(\alpha/32)$-dense in $U_{k+1}$, for all $i \ge p + 1$. We
call this property of $U_{k+1}$ the density property.

Fix a sequence of insertions $\sigma$ performed by the adversary prior to the wide phase
and fix a subuniverse $U_{k+1}$ chosen by the adversary with the density property. Under
these conditions, if $H$ succeeds against the adversary, then the induced hashing scheme
$H_2$ succeeds against its recursive adversary; the second phase cannot get truncated
because $U_{k+1}$ has a dense fixed intersection with $\mathrm{dom}(h_l)$. By induction, under the
above conditions, $H$ succeeds against the adversary using $\mathcal{H}$ with probability $\le e^{-W_2 b}$.

In  both  cases,  the  probability  that  H  succeeds  using  root  hashtopy  H  is  at  most 

for  sufficiently  large  Ci.  The  total  number  of  root  hashtopies  Ti  available  is  at  most 
2^6(34)^-'/^  If  ^ve  let  W  <  (1^20/1000(34)'"- '6)  (by  making  cj  large  enough),  then  the 
probability  that  H  successfully  completes  the  wide  phase  is  at  most  g-^^'s/iooo 

The probability that $H$ succeeds against the adversary is at most $e^{-W_1 b} + e^{-W_2 b/1000} \le
e^{-Wb}$, since $W$ is small relative to $W_1$ and $W_2$. This completes the proof of the lemma.

2.3.5     Amortization 

In this section, we prove a lower bound of $\Omega(\log(\log n / \log b))$ on the amortized cost
of dictionary operations in the refresh cost multilevel hashing model, where $n$ = the
maximum dictionary size during the operation sequences.

Any hashing scheme $H$ with an amortized cost of $r$ on searches and an amortized
cost of $w$ on updates can be converted into a hashing scheme $H'$ with a worst-case cost
of $r$ on searches and an amortized cost of $w$ on updates. $H'$ simulates the behaviour of $H$
in processing all the operations, but always stores the dictionary elements in the first $r$
levels. A search operation is performed by $H'$ by simulating $H$, but $H'$ does not change
the memory configuration even if $H$ does. An update operation is performed by $H'$ by
simulating $H$ and then compressing the dictionary to the first $r$ levels by performing
sufficiently many greedy searches that each have a search cost $\ge r + 1$. Consider a
sequence $\sigma$ of $u$ update operations performed on $H'$. $\sigma$ translates into a sequence $\bar\sigma$ of
$u$ update operations and $g$ greedy searches on $H$. Let $w_i$ = the refresh cost of the $i$-th
operation in $\sigma$. The cost of $\bar\sigma$ on $H$ is at least $\sum_i w_i + g(r + 1)$ and at most $gr + uw$, so
it follows that $\sum_i w_i \le uw$. We conclude that $H'$ incurs a worst-case cost of $r$ on search
operations and an amortized cost of $w$ on update operations.
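
The accounting step in this conversion can be written out as a single chain. The display below is a sketch that uses the notation of the paragraph above, with $\bar\sigma$ denoting the translated sequence on $H$ and $\mathrm{cost}_H$ an assumed shorthand for the total cost that $H$ incurs:

```latex
\sum_i w_i + g(r+1) \;\le\; \mathrm{cost}_H(\bar\sigma) \;\le\; gr + uw
\quad\Longrightarrow\quad
\sum_i w_i \;\le\; gr + uw - g(r+1) \;=\; uw - g \;\le\; uw .
```

The lower bound charges each greedy search at least $r + 1$, while the upper bound uses the amortized guarantees of $H$ ($r$ per search, $w$ per update), so the $g$ greedy searches never cost more than they save.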

We  complete  the  proof  of  Theorem  2.1  by  showing  that  a  hashing  scheme  incurs 
either  a  worst-case  cost  of  r  on  searches  or  an  amortized  cost  of  Q(n*/'^  /2^  6)  on  updates 
in processing operation sequences of length $O(n)$. We modify the worst-case adversary
$A^{r}_{\bar\alpha,n,w,b}$ by appropriately performing greedy insertion batches following random insertion
batches. A configuration $C$ of a partial hashing scheme $H$ is said to be $w$-amortized,
for a positive integer $w$, if the cost incurred by $H$ in processing any sequence $\sigma$ of
update operations, starting from configuration $C$, is at most $|\sigma|w$. The new adversary
is tailored against $(\alpha, e)$-good partial hashing schemes that start processing the
adversary sequence in a $w$-amortized configuration; here $e$ is a suitably defined positive
integer. Either the adversary performs a complete sequence of $O(n)$ insertions, or it
gets truncated because the scheme has entered an $(\alpha, e)$-bad configuration or because the
scheme was not in a $w$-amortized configuration initially.

We define the new adversary against a partial hashing scheme $H$; let $\alpha$ denote the
last component of $\bar\alpha$. A $w$-greedy insertion batch performed against a hashing scheme $H$
in a configuration $C$ is defined to be a maximal sequence $\sigma$ of insertions, starting from
configuration $C$, during which $H$ incurs an update cost of at least $w|\sigma|$. The adversary
performs a random first batch just like the worst-case adversary, then performs a
$(2w)$-greedy insertion batch, and, finally, performs the tail phase; the adversary announces
truncation even before performing the first batch if the initial configuration of $H$ is not
$w$-amortized. The tail phase is defined recursively, essentially as before, and consists
of a narrow phase, possibly followed by a wide phase; in the case $r = 1$, as before,
the tail phase is a suitable extension batch that takes $H$ to an $\alpha$-dense configuration.
The  narrow  phase  consists  of  the  tail  phase  of  the  recursive  adversary  A.'jf^^  ^  ^  performed 
against  the  induced  hashing  scheme  Hi  that  is  constructed  as  before.  If  the  narrow  phase 
gets  truncated,  then  Hi  has  entered  an  (Q,ei)-bad.  (2^'"^u;)-amortized  configuration. 
Equivalently,  H  has  entered  a  (2^"^"  tij)-amortized  configuration  such  that  H  uses  only 
[ait/ 2,  Wi/2)-wide  root  hash  functions  in  any  o-dense  configuration  during  the  next  ej 
insertions.  The  wide  phase  consists  of  a  first  phase  followed  by  a  second  phase  that 
are  defined  slightly  differently  from  the  worst-case  adversary.  During  the  first  phase, 
following  each  batch  of  random  insertions,  we  perform  a  (2^'~  +^u;)-greedy  insertion 
batch  so  that,  at  the  end  of  the  first  phase,  /^  is  in  a  (•2^'"^"^'u')-amortized  configuration. 
The  second  phase  is  recursively  defined  to  be  the  tail  phase  of  adversary  ^^~^,  2^^-'^wb 
performed  against  the  induced  hashing  scheme  Hi  that  is  constructed  as  before.  This 
completes  the  definition  of  A^  „  ^^  (,. 

We define parameters $e$, $e_1$, $e_2$, $W$, $W_1$, and $W_2$ as functions of $r$, $\alpha$, $n$, and $w$. Only
the definition of $e_2^{r}(\alpha, n, w)$ has to be modified since it depends on $w$:

e5(a,  n,  w)  =  aWl(a,  n,  u')/(co2^'"'  w)  (2.7) 

The  recurrence  for  W  becomes: 


[   an/cib  otherwise 

This  recurrence  has  the  following  solution: 

^2""+' -3 


Vy  (q,  71,  w)  =  n- 


Hence  the  values  of  the  parameters  is  approximately  n(Q/2^  u;6)^  . 

Lemmas 2.3, 2.4, and 2.5 still hold for the new adversary, with the exception of Lemma 2.4,
Parts i. and iv. We modify some parts of the lemmas as follows:

Lemma 2.3 ii'. $(10/\alpha)\,e_2(\alpha, n, w) \le e_1(\alpha, n, w)$, for all $r$, $\alpha$, $n$, and $w$.

Lemma 2.4 i'. The maximum number of insertions performed by the new adversary is at most $4n$.
iv'. The maximum number of insertions performed by the new adversary during the wide phase is
at most $(10/\alpha)\,e_2 \le e_1$.

Lemma  2.5  i'.  A  complete  adversary  sequence  leaves  the  hashing  scheme  in  an  a-dense 
configuration;  a  truncated  adversary  sequence  leaves  the  hashing  scheme  in  a  (2^  w)- 
amortized  (a,e)-bad  configuration. 

Lemma 2.11 still holds for the new adversary.

Lemma 2.12 Let $\bar\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ be a sequence of values in the interval $[0, 1]$, let
$\alpha = \alpha_k$, and let $W = W^{r}(\alpha, n, w) \ge 1$. Let $H$ be a partial hashing scheme that has a
$(Wb)$-bit root location and $b$-bit locations at levels $\ge 2$. $H$ succeeds against the new adversary with
probability $\le e^{-Wb}$.

This lemma gives a trade-off between the worst-case cost of searches, $r$, and the
amortized cost of updates, $w$, incurred by a multilevel hashing scheme in the refresh cost
model in processing sequences of $O(n)$ operations:

The lower bound of Theorem 2.1 follows from this trade-off and our procedure for converting
an amortized hashing scheme into a hashing scheme with a worst-case cost guarantee
for search operations.

Lemma  2.12  is  proved  in  the  same  way  as  Lemma  2.11.  The  only  part  of  the  proof 
that  changes  due  to  amortization  is  Case  1  of  the  induction  step.  In  this  case  the  total 
refresh  cost  available  during  the  wide  phase  is  at  most  10622^"^  w/a.  We  choose  cq  >  320 
so  that  this  quantity  is  less  than  Wi/Z2,  which  is  the  probable  refresh  cost  incurred  by 
the  first  batch.  The  rest  of  the  proof  remains  as  before. 
Chapter 3

The  Deque  Conjecture 


Splay is an algorithm for searching binary search trees, devised by Sleator and Tarjan,
that  reorganizes  the  tree  by  means  of  rotations.  Sleator  and  Tarjan  conjectured  that 
Splay  is,  in  essence,  the  fastest  algorithm  for  processing  any  sequence  of  search  operations 
on  a  binary  search  tree,  using  only  rotations  to  reorganize  the  tree.  Tarjan  proved  a 
special  case  of  this  conjecture,  called  the  Scanning  Theorem,  and  conjectured  a  more 
general  special  case,  called  the  Deque  Conjecture. 

In  this  chapter,  we  prove  tight  bounds  for  some  combinatorial  problems  involving 
rotation  sequences  on  binary  trees,  derive  a  result  that  is  a  close  approximation  to  the 
Deque Conjecture, and give two new proofs of the Scanning Theorem*.

3.1      Introduction 

We  review  the  Splay  Algorithm,  its  conjectures  and  previous  works  on  them,  and  describe 
our  results. 

3.1.1      The  Splay  Algorithm  and  Its  Conjectures 

Splay  is  a  simple,  efficient  algorithm  for  searching  binary  search  trees,  devised  by  Sleator 
and Tarjan [28]. A splay at an element x of a binary search tree first locates the element
in  the  tree  by  traversing  the  path  from  the  root  of  the  tree  to  the  element  (called  the 
access  path  of  the  element)  and  then  transforms  the  tree  by  means  of  rotations  in  order 


*The  work  of  this  chapter  was  reported  in  [30]. 
to speed up future searches in the vicinity of the element. The splay transformation
moves element x to the root of the tree along its access path by repeating the following
step (see Figure 3.1):

Splay  step. 

Let p and g denote, respectively, the parent and the grandparent of x.
Case 1. p is the root: Make x the new root, by rotating the edge [x,p].
Case 2. [x,p] is a left edge (i.e. an edge to a left child) and [p,g] is a right
edge, or vice versa: Rotate [x,p]; Rotate [x,g].

Case 3. Either both [x,p] and [p,g] are left edges, or both are right edges:
Rotate [p,g]; Rotate [x,p].
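
The three cases of the splay step can be sketched in Python as follows. This is a minimal illustration, assuming a node representation with explicit parent pointers; the names `Node`, `rotate`, `splay`, `insert`, `find`, and `inorder` are chosen here and do not come from the thesis.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate(x):
    """Rotate the edge [x, p], making x the parent of its former parent p."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                      # [x, p] is a left edge: right single rotation
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                # [x, p] is a right edge: left single rotation
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Move x to the root by repeated splay steps; return the new root."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:                                # Case 1: p is the root
            rotate(x)
        elif (p.left is x) != (g.left is p):         # Case 2: edges of opposite kinds
            rotate(x)                                # rotate [x, p]
            rotate(x)                                # rotate [x, g]
        else:                                        # Case 3: edges of the same kind
            rotate(p)                                # rotate [p, g]
            rotate(x)                                # rotate [x, p]
    return x


def insert(root, key):
    """Plain binary search tree insertion (no splaying); return the root."""
    node = Node(key)
    if root is None:
        return node
    cur = root
    while True:
        nxt = cur.left if key < cur.key else cur.right
        if nxt is None:
            break
        cur = nxt
    if key < cur.key:
        cur.left = node
    else:
        cur.right = node
    node.parent = cur
    return root


def find(root, key):
    """Return the node holding key, or None if the key is absent."""
    cur = root
    while cur is not None and cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur


def inorder(root):
    """Return the keys of the tree in symmetric order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)
```

Splaying changes only the shape of the tree: the symmetric order of the keys is the same before and after, and the splayed node ends up at the root.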

Sleator and Tarjan proved that Splay is, up to a constant factor, as efficient as the
more complex traditional balanced tree algorithms for processing any sequence of binary
search tree operations. They also showed that Splay actually behaves even faster on
certain special kinds of sequences and conjectured that Splay is, up to a constant factor,
the  fastest  rotation-based  binary  search  tree  algorithm  for  processing  any  sequence  of 
searches  on  a  binary  search  tree.  We  state  this  conjecture  and  some  closely-related 
conjectures: 

Conjecture 3.1 (Dynamic Optimality Conjecture [28]) Let s denote an arbitrary
sequence of searches of elements in a given n-node binary search tree. Define $\chi(s)$ equal
to the minimum cost of executing sequence s on the tree using an algorithm that performs
searches, incurring a cost equal to (1 + the distance of the element from the root) on each
search, and transforms the tree by means of single rotations, incurring unit cost per single
rotation. Splay takes $O(n + \chi(s))$ time to process s.

Conjecture 3.2 (Deque Conjecture [33]) Deque operations on a binary tree transform
the tree by inserting or deleting nodes at the left or right end of the tree. We
perform deque operations on a binary tree using Splay as follows (see Figure 3.2): Pop
splays at the leftmost node and removes it from the tree; Push inserts a new node to the
left, making the old tree its right subtree; Eject and Inject are symmetric operations
performed at the right end. Splay takes $O(m + n)$ time to process a sequence of m deque
operations on an n-node binary tree.
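
The left-end deque operations can be sketched in Python as below. This is an illustrative sketch, not the thesis's code: the names `Node`, `splay_leftmost`, `pop`, and `push` are assumptions, `pop` assumes a non-empty tree, and Eject and Inject (the mirror images) are omitted. Because the access path of the leftmost node consists only of left edges, only Case 1 and Case 3 of the splay step can occur.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def splay_leftmost(root):
    """Splay at the leftmost node; return (new_root, number_of_single_rotations)."""
    path = []
    node = root
    while node is not None:              # collect the left path, root first
        path.append(node)
        node = node.left
    x = path.pop()                       # the leftmost node
    rotations = 0
    while path:
        p = path.pop()
        if path:                         # Case 3: [x, p] and [p, g] are both left edges
            g = path.pop()
            g.left, p.right = p.right, g     # rotate [p, g]
            p.left, x.right = x.right, p     # rotate [x, p]
            rotations += 2
        else:                            # Case 1: p is the root
            p.left, x.right = x.right, p     # rotate [x, p]
            rotations += 1
        if path:                         # re-attach x below the remaining left path
            path[-1].left = x
    return x, rotations


def pop(root):
    """Pop: splay at the leftmost node and remove it. Returns (key, new_root, rotations)."""
    x, rotations = splay_leftmost(root)
    return x.key, x.right, rotations     # x is the root and has no left child


def push(root, key):
    """Push: insert a new leftmost node whose right subtree is the old tree."""
    return Node(key, None, root)
```

Counting single rotations, as `pop` does here, is the natural cost measure for the conjecture, since the time of a splay is proportional to the number of rotations it performs.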
Figure 3.1: A splay step (Cases 1, 2, and 3).
Conjecture 3.3 (Right Turn Conjecture [33]) Define a right 2-turn on a binary
tree to be a sequence of two right single rotations performed on the tree in which the
bottom node of the first rotation coincides with the top node of the second rotation (see
Figure 3.3). In a sequence of right 2-turns and right single rotations performed on an
n-node binary tree, there are only $O(n)$ right 2-turns.


Figure 3.2: The deque operations.


The  conjectures  are  related  as  follows.  A  stronger  form  of  the  Dynamic  Optimality 
Conjecture  that  allows  update  operations  as  well  as  search  operations  implies  the  Deque 
Conjecture.  The  Right  Turn  Conjecture  also  implies  the  Deque  Conjecture. 
3.1.2     Terminology

We define the basic terminology used in the chapter. A binary search tree over an
ordered universe is a binary tree whose nodes are assigned elements from the universe
in symmetric order; that is, for any node x assigned an element e, the elements in the
left subtree of x are less than e and the elements in the right subtree of x are greater
than  e.  The  path  between  the  root  and  the  leftmost  node  in  a  binary  tree  is  called  the 
left  path.  A  tree  in  which  the  left  path  is  the  entire  tree  is  called  a  left  path  tree.  The 
edge  between  a  node  and  its  left  child  in  a  binary  tree  is  called  a  left  edge.  The  paths 
in  a  binary  tree  that  comprise  only  left  edges  are  called  left  subpaths.  The  left  depth  of 
a  node  in  a  binary  tree  is  defined  to  be  the  number  of  left  edges  on  the  path  between  the 
node  and  the  root.  The  terms  right  path  tree,  right  path,  right  edge,  right  subpath,  and 
right  depth  are  defined  analogously.  A  single  rotation  of  an  edge  [x,p]  in  a  binary  tree 
is  a  transformation  that  makes  x  the  parent  of  p  by  transferring  one  of  the  subtrees  of 
x  to  p  (See  Figure  3.3).  A  single  rotation  is  called  right  or  left,  respectively,  according 
to whether [x,p] was originally a left edge or a right edge. A rotation on a binary tree
is  a  sequence  of  single  rotations  performed  on  the  tree.  A  rotation  is  called  left  or  right, 
respectively,  if  it  consists  solely  of  left  single  rotations  or  solely  of  right  single  rotations. 
A  double  rotation  on  a  binary  tree  is  a  sequence  of  two  single  rotations  performed  on  the 
tree  that  have  a  node  in  common  (as,  for  instance,  by  a  splay  step).  A  left  path  rotation 
on  a  binary  tree  is  a  right  single  rotation  performed  on  the  left  path  of  the  tree.  A  right 
path  rotation  is  defined  analogously. 

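We define the Ackermann hierarchy of functions $\{A_i \mid i \ge 0\}$, its inverse hierarchy
$\{\alpha_i \mid i \ge 0\}$, and inverse functions $\alpha(n)$ and $\alpha(m, n)$ of the Ackermann function as follows:

$$\begin{aligned}
A_0(j) &= 2j && \text{for all } j \ge 1 \\
A_1(j) &= 2^j && \text{for all } j \ge 1 \\
A_i(j) &= \begin{cases} A_{i-1}(2) & \text{if } i \ge 2 \text{ and } j = 1 \\ A_{i-1}(A_i(j - 1)) & \text{if } i \ge 2 \text{ and } j \ge 2 \end{cases} \\
\alpha_i(n) &= \min\{k \ge 1 \mid A_i(k) \ge n\} && \text{for all } n \ge 1 \\
\alpha(n) &= \min\{k \ge 1 \mid A_k(1) \ge n\} && \text{for all } n \ge 1 \\
\alpha(m, n) &= \min\{k \ge 1 \mid A_k(\lfloor m/n \rfloor) > \log n\} && \text{for all } m \ge n \ge 1
\end{aligned}$$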
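
To make the hierarchy concrete, the following Python sketch evaluates $A_i(j)$, $\alpha_i(n)$, and the one-argument inverse $\alpha(n)$ directly from the definitions. It is a sketch under stated assumptions: the names `A`, `alpha_i`, and `alpha` and the `cap` argument are introduced here. Values are truncated at `cap`, which is safe for computing the inverses because every $A_i$ is increasing and satisfies $A_i(j) \ge j$.

```python
def A(i, j, cap):
    """Return min(A_i(j), cap); the cap keeps the huge values of the hierarchy manageable."""
    if i == 0:
        return min(2 * j, cap)
    if i == 1:
        # 2**j already exceeds cap once j reaches cap.bit_length()
        return cap if j >= cap.bit_length() else min(2 ** j, cap)
    v = A(i - 1, 2, cap)                 # A_i(1) = A_{i-1}(2)
    for _ in range(j - 1):               # A_i(j) = A_{i-1}(A_i(j-1))
        if v >= cap:
            return cap
        v = A(i - 1, v, cap)
    return v


def alpha_i(i, n):
    """Inverse hierarchy: the least k >= 1 with A_i(k) >= n."""
    k = 1
    while A(i, k, n) < n:
        k += 1
    return k


def alpha(n):
    """One-argument inverse: the least k >= 1 with A_k(1) >= n."""
    k = 1
    while A(k, 1, n) < n:
        k += 1
    return k
```

For example, `alpha_i(0, n)` and `alpha_i(1, n)` reproduce $\lceil n/2 \rceil$ and $\lceil \log n \rceil$ for $n \ge 2$, and `alpha` stays at most 4 for every input that fits in memory.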
Figure 3.3: The various types of rotations: (i) a single rotation, (ii) a right 3-twist,
(iii) a right 3-turn, (iv) a right 3-cascade.
The following table concretizes this definition:

i                  0          1           2                       3            4
A_i(j)             2j         2^j         2^2^...^2 (j+1 twos)
α_i(n)             ⌈n/2⌉      ⌈log n⌉     ≤ log* n                ≤ log** n    ≤ log*** n
{n | α(n) = i}                [1, 2]      [3, 4]                  [5, 16]      [17, 2^2^...^2] (17 twos)

3.1.3  Previous  Works 

Previous works on the Dynamic Optimality Conjecture have been mostly directed towards
resolving its corollaries. Tarjan [33] proved that Splay requires linear time to
sequentially scan the nodes of an n-node binary tree in symmetric order. This theorem,
called the Scanning Theorem, is a corollary of all of the above conjectures. He also extended
his proof to a proof of the Deque Conjecture when all the output operations are
performed at one end of the tree. Lucas [22] obtained an $O(n\alpha(n))$ upper bound for the
Deque Conjecture when all the operations are output operations and the initial tree is a
simple path between the leftmost and rightmost nodes. Building upon the work of Cole
et al. [11], Cole [9,10] recently proved Sleator and Tarjan's Dynamic Finger Conjecture
[28] for the Splay Algorithm, which is a corollary of the Dynamic Optimality Conjecture.
Wilber [36] gave two elegant techniques for lower-bounding $\chi(s)$. The techniques yield
optimal lower bounds for some special sequences (such as $\chi(s) = \Omega(n \log n)$ for the
bit-reversal permutation), but it is not clear how tight these lower bounds are for general
sequences.

A  related  combinatorial  question  that  has  been  studied  is,  how  many  single  rotations 
are  needed,  in  the  worst  case,  to  transform  one  n-node  binary  tree  into  another  n-node 
binary  tree?  Culik  and  Wood  [12]  noted  that  2n  -  2  rotations  suffice  and,  later,  Sleator 
et  al.  [29]  derived  the  optimal  bound  of  2n  -  6  rotations  for  all  sufficiently  large  n. 

3.1.4  Our  Results 


Our work is directed towards resolving the Deque Conjecture. A good understanding of the powers of various types of rotations on binary trees would equip us with the necessary tools to tackle the conjecture. We prove almost tight upper and lower bounds on the maximum numbers of occurrences of various types of right rotations in a sequence of right rotations performed on a binary tree. We study the following types of rotations (See Figure 3.3):

Right twist: For all $k \ge 1$, a right $k$-twist is a sequence of $k$ right single rotations performed along a left subpath of a binary tree, traversing the subpath top-down.

Right turn: For all $k \ge 1$, a right $k$-turn is a right $k$-twist that converts a left subpath of $k$ edges in a binary tree into a right subpath.

Right cascade: For all $k \ge 1$, a right $k$-cascade is a right $k$-twist that rotates every other edge lying on a left subpath of $2k - 1$ edges in a binary tree.
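To make the three definitions concrete, here is a small Python sketch, not taken from the original text. The `Node` class and the helper names `rotate_right`, `left_path`, `right_turn`, `right_cascade` and `right_path_keys` are illustrative assumptions. It shows a right $k$-turn and a right $k$-cascade as special cases of rotating selected edges of a left subpath from the top down.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None


def rotate_right(x):
    """Right single rotation of the edge joining x to its parent p,
    where x is the left child of p; x moves up and p becomes its right child."""
    p = x.parent
    assert p is not None and p.left is x
    g = p.parent
    p.left = x.right                 # x's right subtree moves under p
    if x.right is not None:
        x.right.parent = p
    x.right = p
    p.parent = x
    x.parent = g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def left_path(keys_top_down):
    """Build a left path and return its nodes, listed from top to bottom."""
    nodes = [Node(k) for k in keys_top_down]
    for upper, lower in zip(nodes, nodes[1:]):
        upper.left = lower
        lower.parent = upper
    return nodes


def right_turn(nodes_top_down):
    """Right k-turn: rotate all k edges of a left subpath of k + 1 nodes, top-down."""
    for child in nodes_top_down[1:]:
        rotate_right(child)
    return nodes_top_down[-1]        # the lowest node ends up on top


def right_cascade(nodes_top_down):
    """Right k-cascade: rotate every other edge of a left subpath of 2k nodes, top-down."""
    for child in nodes_top_down[1::2]:
        rotate_right(child)


def right_path_keys(top):
    """Keys along the right path starting at top."""
    keys = []
    while top is not None:
        keys.append(top.key)
        top = top.right
    return keys


nodes = left_path([3, 2, 1])
print(right_path_keys(right_turn(nodes)))   # the left path has become a right path
```

A general right $k$-twist rotates any $k$ edges of a left subpath in top-down order; the turn and the cascade differ only in which edges are chosen.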

A right twist sequence is a sequence of right twists performed on a binary tree. Define $Tw_k(n)$, $Tu_k(n)$ and $C_k(n)$, respectively, to be the maximum numbers of occurrences of $k$-twists, $k$-turns and $k$-cascades in a right twist sequence performed on an n-node binary tree. These numbers are well defined since a tree is transformed into a right path after at most $\binom{n}{2}$ right single rotations. We derive the following bounds for $Tw_k(n)$, $Tu_k(n)$ and $C_k(n)$:

                        Upper bound                                         Lower bound
  Tw_k(n)               O(k n^{1+1/k})                                      \Omega(n^{1+1/k}) - O(n)
  Tu_k(n), C_k(n)       O(n \alpha_{\lfloor k/2 \rfloor}(n))  if k \ne 3    \Omega(n \alpha_{\lfloor k/2 \rfloor}(n)) - O(n)  if k \ne 3
                        O(n \log\log n)                       if k = 3      \Omega(n \log\log n)                              if k = 3

The bounds for $Tu_k(n)$ and $C_k(n)$ are tight if $k \le 2\alpha(n) - 5$ and the bounds for $Tw_k(n)$ are nearly tight. The Right Turn Conjecture is refuted by the lower bound of $\Omega(n \log n)$ for $Tu_2(n)$.^1 We apply the upper bound for cascades to derive an $O((m + n)\alpha(m + n))$ upper bound for the Deque Conjecture.

Another  approach  to  the  Deque  Conjecture  is  to  find  new  proofs  of  the  Scanning 
Theorem  that  might  naturally  extend  to  the  Deque  Conjecture  setting.  We  obtain  a 
simple  potential-based  proof  that  solves  Tarjan's  problem  [33]  of  finding  a  potential- 
based  proof  of  the  theorem,  and  an  inductive  proof  that  generalizes  the  theorem.  The 
new  proofs  enhance  our  understanding  of  the  Scanning  Theorem,  but,  so  far,  have  not 
led  to  a  proof  of  the  Deque  Conjecture. 


^1 S. R. Kosaraju has independently proved that $Tu_2(n) = \Theta(n \log n)$. While his upper bound proof differs from ours, the lower bound constructions match.

The chapter is organized as follows. In Section 3.2, we prove the bounds for $Tw_k(n)$, $Tu_k(n)$ and $C_k(n)$. In Section 3.3, we derive the upper bound for the Deque Conjecture. In Section 3.4, we describe the new proofs of the Scanning Theorem.

3.2      Counting  Twists,  Turns,  and  Cascades 

The two subsections of this section derive the upper and lower bounds for $Tw_k(n)$, $Tu_k(n)$ and $C_k(n)$.

3.2.1      Upper  Bounds 

All  our  upper  bound  proofs  are  based  on  a  recursive  divide-and-conquer  strategy  that 
partitions  the  binary  tree  on  which  the  right  twist  sequence  is  performed  into  a  collection 
of  vertex-disjoint  subtrees,  called  block  trees.  The  root  and  some  other  nodes  within  each 
block  are  labeled  global  and  the  global  nodes  of  all  of  the  block  trees  induce  a  new  tree 
called  the  global  tree.  Each  rotation  on  the  original  tree  effects  a  similar  rotation  either 
on  one  of  the  block  trees  or  on  the  global  tree.  This  allows  us  to  inductively  count  the 
number  of  rotations  of  each  type  in  the  sequence. 

We need the notion of blocks in binary trees [33]. Consider an n-node binary tree $B$ whose nodes are labeled from 1 to n in symmetric order. A block of $B$ is an interval $[i,j] \subseteq [1,n]$ of nodes in $B$. Any block $[i,j]$ of $B$ induces a binary tree $B|_{[i,j]}$, called the block tree of block $[i,j]$, which comprises exactly the nodes $i$ to $j$. The root of $B|_{[i,j]}$ is the lowest common ancestor of nodes $i$ and $j$ in $B$. The left child of a node $x$ in $B|_{[i,j]}$ is the highest node in the left subtree of $x$ in $B$ which lies in block $[i,j]$. The right child of a node in $B|_{[i,j]}$ is defined analogously. Notice that, for the subtree rooted at any node of $B$, the highest node of the subtree which lies in block $[i,j]$ is unique whenever it exists: if two equally highest nodes exist, then their lowest common ancestor in the subtree would be higher than the two nodes, resulting in a contradiction. How does a rotation on $B$ affect a block tree $B|_{[i,j]}$? If both of the nodes involved in the rotation are in $B|_{[i,j]}$, then the rotation translates into a rotation on $B|_{[i,j]}$ involving the same pair of nodes. Otherwise, $B|_{[i,j]}$ is not affected.
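The block tree construction can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the original text: the dictionary representation (node label mapped to a pair of child labels) and the names `highest_in_block` and `block_tree` are assumptions. Because nodes are labeled in symmetric order, the highest block node in a subtree is found by a single descent.

```python
def highest_in_block(tree, v, i, j):
    """Highest node of the subtree rooted at v whose label lies in [i, j], or None.
    Labels follow symmetric order, so block nodes below a node v < i lie to its
    right and block nodes below a node v > j lie to its left."""
    while v is not None and not (i <= v <= j):
        left, right = tree[v]
        v = right if v < i else left
    return v


def block_tree(tree, root, i, j):
    """Return (root of the block tree, children map of the block tree) for block [i, j]."""
    block_root = highest_in_block(tree, root, i, j)   # lowest common ancestor of i and j
    block = {}
    stack = [block_root]
    while stack:
        x = stack.pop()
        left, right = tree[x]
        child_left = highest_in_block(tree, left, i, j)
        child_right = highest_in_block(tree, right, i, j)
        block[x] = (child_left, child_right)
        stack.extend(c for c in (child_left, child_right) if c is not None)
    return block_root, block


# A perfectly balanced tree on the labels 1..7 with root 4.
tree = {4: (2, 6), 2: (1, 3), 6: (5, 7),
        1: (None, None), 3: (None, None), 5: (None, None), 7: (None, None)}
print(block_tree(tree, 4, 3, 5))
```

In the example, nodes 3 and 5 are not adjacent to node 4 in the original tree, yet they become its children in the block tree of $[3,5]$.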

The functions $Tw_k$, $Tu_k$ and $C_k$ are superadditive:

Lemma 3.1 For all $k \ge 1$ and $m \ge n \ge 1$, we have:

a. $\lfloor m/n \rfloor\, Tw_k(n) \le Tw_k(m)$,

b. $\lfloor m/n \rfloor\, Tu_k(n) \le Tu_k(m)$, and

c. $\lfloor m/n \rfloor\, C_k(n) \le C_k(m)$.

Proof. We prove Part a.; Parts b. and c. are similar. Given a right twist sequence $S$ for an n-node binary tree $B$ that comprises $Tw_k(n)$ right $k$-twists, construct a new tree of size $\lfloor m/n \rfloor n \le m$ by starting with a copy of $B$ and successively inserting a new copy of $B$ as the right subtree of the rightmost node in the current tree $\lfloor m/n \rfloor - 1$ times. Since $S$ can be performed on each of the copies of $B$ one after another, there exists a right twist sequence with $\lfloor m/n \rfloor\, Tw_k(n)$ $k$-twists for a tree of size at most $m$, and hence for a tree of size $m$. Part a. follows immediately. □

The upper bound for twists is the simplest to derive. Define $L_i(j) = \binom{i+j-1}{i}$ for all $i \ge 1$ and $j \ge 1$. The upper bound for $Tw_k(n)$ for n of the form $L_k(j)$ is given by:

Lemma 3.2 $Tw_k(L_k(j)) \le k\binom{k+j-1}{k+1}$ for all $k \ge 1$ and $j \ge 1$.

Proof. We use double induction on $k$ and $j$.

Case 1. $k = 1$ or $j = 1$: Straightforward.

Case 2. $k \ge 2$ and $j \ge 2$: The tree is partitioned into a left block of $L_{k-1}(j)$ nodes and a right block of $L_k(j-1)$ nodes. A right twist sequence on the tree translates into corresponding right twist sequences on the left and right block trees. We classify the $k$-twists in the original sequence into three categories and count the number of $k$-twists of each type separately. In the first type of $k$-twist, the lowest $k - 1$ single rotations involve only left block nodes. Such a $k$-twist translates into a $(k-1)$-twist on the left block tree. Applying induction to the induced right twist sequence performed on the left block tree, we see that there are at most $(k-1)\binom{k+j-2}{k}$ $k$-twists of the first type in the original right twist sequence. Similarly, the number of $k$-twists that involve only right block nodes is at most $k\binom{k+j-2}{k+1}$. Consider a $k$-twist that does not belong to these two categories. The highest single rotation of such a twist must involve only right block nodes; also, the lowest node involved in the twist must be a left block node. This implies that the highest node of the twist is a right block node that leaves the left path of its block as a result of the twist. Right rotations never add nodes to a block's left path, so the number of $k$-twists in the last category is at most the initial size of the left path of the right block, which is at most $L_k(j-1) = \binom{k+j-2}{k}$. It follows that the total number of right $k$-twists in the right twist sequence is bounded by

\[
(k-1)\binom{k+j-2}{k} + k\binom{k+j-2}{k+1} + \binom{k+j-2}{k} = k\binom{k+j-1}{k+1}. \qquad \Box
\]


A  simple  calculation  using  the  above  lemma  and  Lemma  3.1a  gives  the  upper  bound 
for  Twk(n)  for  all  n: 

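Theorem 3.1 $Tw_k(n) \le k\,n^{1+1/k}$ for all $k \ge 1$ and $n \ge 1$.

Proof. Let $j = \min\{ i \ge 1 \mid \binom{k+i-1}{k} \ge n \}$. Then we have

\[
\begin{aligned}
Tw_k(n) &\le Tw_k(L_k(j)) \,/\, \lfloor L_k(j)/n \rfloor && \text{(By Lemma 3.1a)} \\
        &\le k\binom{k+j-1}{k+1} \,\Big/\, \Big\lfloor \binom{k+j-1}{k} \big/ n \Big\rfloor && \text{(By Lemma 3.2)} \\
        &\le k\,n^{1+1/k}.
\end{aligned}
\]

□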
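The "simple calculation" behind Theorem 3.1 can be carried out in several ways. The following is one possible sketch, not necessarily the route taken in the original proof. It assumes $n \ge 2$, takes $j$ to be the least index with $L_k(j) = \binom{k+j-1}{k} \ge n$ (so that $j \ge 2$ and $L_k(j-1) < n$), and uses only the identity $\binom{k+j-1}{k+1} = \frac{j-1}{k+1} L_k(j)$, the inequality $\lfloor x \rfloor \ge x/2$ for $x \ge 1$, and the AM-GM bound $(k!)^{1/k} \le (k+1)/2$.

```latex
\begin{align*}
Tw_k(n) &\le \frac{Tw_k(L_k(j))}{\lfloor L_k(j)/n \rfloor}
         \le \frac{2n}{L_k(j)} \cdot k\binom{k+j-1}{k+1}
         = \frac{2kn\,(j-1)}{k+1}, \\
\frac{(j-1)^k}{k!} &\le \binom{k+j-2}{k} = L_k(j-1) < n
  \;\Longrightarrow\; j-1 < (k!\,n)^{1/k} \le \frac{k+1}{2}\, n^{1/k}, \\
Tw_k(n) &< \frac{2kn}{k+1} \cdot \frac{k+1}{2}\, n^{1/k} = k\, n^{1+1/k}.
\end{align*}
```

For $n = 1$ the bound is trivial, since a one-node tree admits no rotations.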

We derive the upper bounds for turns and cascades. It is easy to see that $Tu_1(n) = C_1(n) = \binom{n}{2} \le n\alpha_0(n)$. Let us prove that $Tu_2(n) = O(n\alpha_1(n))$. Consider any right twist sequence performed on a binary tree $B$. We divide $B$ into a left block $[1, \lfloor n/2 \rfloor]$ and a right block $[\lfloor n/2 \rfloor + 1, n]$. Every 2-turn either involves nodes from only one block (intrablock) or involves nodes from both blocks (interblock). An intrablock 2-turn effects a 2-turn in the corresponding block tree and gets counted in the right twist sequence for the block tree. Every interblock 2-turn either adds a node to the right path of the left block tree or deletes a node from the left path of the right block tree (See Figure 3.4). Right rotations never remove nodes from a block's right path or add nodes to a block's left path, so the number of interblock 2-turns is at most $n - 2$. This leads to the following recurrence for $Tu_2(n)$:

\[
Tu_2(n) \le \begin{cases} Tu_2(\lfloor n/2 \rfloor) + Tu_2(\lceil n/2 \rceil) + n - 2 & \text{if } n \ge 3 \\ 0 & \text{if } 1 \le n \le 2 \end{cases}
\]

Solving the recurrence yields the desired bound for $Tu_2(n)$:

\[
Tu_2(n) \le n\lceil \log n \rceil - 2^{\lceil \log n \rceil} - n + 2 \le n\alpha_1(n).
\]
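The solution of the recurrence is easy to check numerically. The sketch below is illustrative and not part of the original text; the function names are assumptions. It treats the recurrence as an equality with base value 0 for $n \le 2$ (a 2-turn needs three nodes) and compares it with the closed form $n\lceil \log n \rceil - 2^{\lceil \log n \rceil} - n + 2$.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def turn2_recurrence(n: int) -> int:
    """Worst case allowed by the recurrence: T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 2."""
    if n <= 2:
        return 0
    return turn2_recurrence(n // 2) + turn2_recurrence((n + 1) // 2) + n - 2


def ceil_log2(n: int) -> int:
    """Ceiling of the base-2 logarithm of n, for n >= 1."""
    return (n - 1).bit_length()


def closed_form(n: int) -> int:
    c = ceil_log2(n)
    return n * c - 2 ** c - n + 2


# The two agree on every n tried, and stay below n * ceil(log n) for n >= 2.
print(all(turn2_recurrence(n) == closed_form(n) for n in range(1, 5000)))
```

The closed form is the familiar mergesort-style solution shifted by $n - 1$, which is why the bound $n\alpha_1(n)$ has room to spare.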


With a slight modification the same proof works for 2-cascades also. An interblock 2-cascade either decreases the size of the left path of the right block tree or increases the number of left block nodes whose left depth relative to the block is at most 1 (See Figure 3.5). Right rotations never increase the left depth of a node, so the number of interblock 2-cascades is at most $n - 3$. The bound $C_2(n) \le n\alpha_1(n)$ follows.


Figure  3.4:  The  two  types  of  interblock  right  2-turns.  Circles  denote  left  block  nodes  and 
squares  denote  right  block  nodes.  The  stars  identify  the  nodes  that  lie  on  the  left/right 
path  of  a  block. 




Figure 3.5: The three types of interblock right 2-cascades. The sharps identify the left block nodes that have a left depth of at most 1; the stars identify the right block nodes lying on the left path of their block.

In order to extend the above argument to $k$-turns and $k$-cascades for $k \ge 3$, we need an Ackerman-like hierarchy of functions $\{K_i \mid i \ge 1\}$:

\[
\begin{aligned}
K_1(j) &= 8j && \text{for all } j \ge 1 \\
K_2(j) &= 2^{4j} && \text{for all } j \ge 1 \\
K_i(j) &= \begin{cases} i\,K_{i-2}(\lfloor i/2 \rfloor) & \text{if } i \ge 3 \text{ and } j = 1 \\ K_i(j-1)\,K_{i-2}(K_i(j-1)/4)/2 & \text{if } i \ge 3 \text{ and } j \ge 2 \end{cases}
\end{aligned}
\]

The function $K_i$ grows faster than the Ackerman function $A_{\lfloor i/2 \rfloor}$:

Lemma 3.3
1. $2^{2^{j}} \le K_3(j)$ for all $j \ge 1$.
2. $A_{\lfloor i/2 \rfloor}(j) \le K_i(j)$ for all $i \ne 3$ and $j \ge 1$. □
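Small values of the hierarchy can be tabulated directly. The Python sketch below is illustrative and not from the original text. It assumes the base cases $K_1(j) = 8j$ and $K_2(j) = 2^{4j}$, which are the values that make $Tu_1(n) \le n\alpha_0(n)$ and $Tu_2(n) \le n\alpha_1(n)$ match the bound $4jK_k(j)$ of Lemma 3.4, together with the recursive rule $K_i(1) = i\,K_{i-2}(\lfloor i/2 \rfloor)$ and $K_i(j) = K_i(j-1)\,K_{i-2}(K_i(j-1)/4)/2$.

```python
def K(i: int, j: int) -> int:
    """Ackerman-like hierarchy K_i(j) under the assumed base cases
    K_1(j) = 8j and K_2(j) = 2^(4j). Integer division is used; it is exact
    for the small arguments exercised here."""
    if i == 1:
        return 8 * j
    if i == 2:
        return 2 ** (4 * j)
    if j == 1:
        return i * K(i - 2, i // 2)
    prev = K(i, j - 1)
    return prev * K(i - 2, prev // 4) // 2


# For i = 3 the rule collapses to squaring: K_3(j) = K_3(j-1)^2 with K_3(1) = 24.
print([K(3, j) for j in range(1, 4)])
```

The squaring behaviour of $K_3$ is what makes the case $k = 3$ doubly exponential in $j$, and hence yields a $\log\log n$ term rather than an inverse Ackerman term. Values such as $K_5(2)$ are already far too large to compute this way.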
The upper bound for $Tu_k(n)$ for n of the form $K_k(j)$ is given by:

Lemma 3.4 $Tu_k(K_k(j)) \le 4jK_k(j)$ for all $k \ge 1$ and $j \ge 1$.

Proof.  We  use  double  induction  on  k  and  j. 

Case 1. $1 \le k \le 2$: The lemma follows from the bounds $Tu_1(n) \le n\alpha_0(n)$ and $Tu_2(n) \le n\alpha_1(n)$.

Case 2. $k \ge 3$ and $j = 1$: We need to show that $Tu_k(K_k(1)) \le 4K_k(1)$. Consider a binary tree $B$ having $K_k(1)$ nodes on which a right twist sequence is performed. Divide $B$ into a sequence of $K_{k-2}(\lfloor k/2 \rfloor)/2$ blocks of size $2k$ each. Each $k$-turn is of one of the following types:

Type A. All of the nodes involved in the $k$-turn belong to a single block: Since a block has only $2k$ nodes, there can be at most one such $k$-turn per block.

Type B. Some two nodes of the $k$-turn belong to a single block, but not all of the nodes of the turn are in that block: Let $C$ denote the block tree of this block. The $k$-turn causes either an increase in the size of the right path of $C$, or a decrease in the size of the left path of $C$, or both. Hence the number of Type-B $k$-turns is at most $2K_k(1)$.

Type C. Each node of the $k$-turn belongs to a different block: To handle this case, we label the root of each block global. The global nodes in $B$ induce a binary tree $G$, called the global tree. The root of $G$ is identical to the root of $B$. The left child of a node $x$ in $G$ is the highest global node in the left subtree of $x$ in $B$. The right child of a node is defined similarly. It is easy to see that the left and right children of any node in $G$ are unique. The effect on $G$ of a rotation on $B$ is analogous to the effect of such a rotation on a block tree of $B$: A rotation on $B$ translates into a rotation on $G$ if both of the nodes of the rotation are global; otherwise, $G$ is unaffected. (If a rotation changes the root of a block then the global role passes from the old root to the new root but this does not affect the global tree.)

Suppose that the $k$-turn turns the left subpath $x_1 - x_2 - \cdots - x_{k+1}$ of $B$ into a right subpath. Since all the $x_i$'s are from different blocks, the nodes $x_2, x_3, \ldots, x_k$ are all global. Therefore, the $k$-turn results in a $(k-2)$-turn on $G$ (if $x_1$ or $x_{k+1}$ is also global, then some right single rotations are also performed on $G$). The number of $(k-2)$-turns that can be performed on $G$ is at most

\[
\begin{aligned}
Tu_{k-2}(K_{k-2}(\lfloor k/2 \rfloor)/2) &\le Tu_{k-2}(K_{k-2}(\lfloor k/2 \rfloor))/2 && \text{(By Lemma 3.1b)} \\
&\le 2\lfloor k/2 \rfloor K_{k-2}(\lfloor k/2 \rfloor) && \text{(By the induction hypothesis)} \\
&\le K_k(1).
\end{aligned}
\]

This gives an upper bound of $K_k(1)$ for the number of Type-C $k$-turns performed on $B$. Summing together the above bounds for the three types of $k$-turns, we obtain a bound of

\[
K_{k-2}(\lfloor k/2 \rfloor)/2 + 2K_k(1) + K_k(1) \le 4K_k(1)
\]

for the total number of $k$-turns in the right twist sequence. This completes Case 2.

Case 3. $k \ge 3$ and $j \ge 2$: We divide the binary tree on which the right twist sequence is executed into $K_k(j)/K_k(j-1)$ blocks of size $K_k(j-1)$ each. We split the $k$-turns into the three types defined in Case 2 and obtain the following tally for each type of turn:

\[
\begin{aligned}
\text{Number of Type-A $k$-turns} &\le \frac{K_k(j)}{K_k(j-1)} \cdot 4(j-1)K_k(j-1) && \text{(By the induction hypothesis)} \\
&\le 4(j-1)K_k(j). \\
\text{Number of Type-B $k$-turns} &\le 2K_k(j). \\
\text{Number of Type-C $k$-turns} &\le Tu_{k-2}(K_{k-2}(K_k(j-1)/4)/2) \\
&\le Tu_{k-2}(K_{k-2}(K_k(j-1)/4))/2 && \text{(By Lemma 3.1b)} \\
&\le K_k(j-1)\,K_{k-2}(K_k(j-1)/4)/2 && \text{(By the induction hypothesis)} \\
&= K_k(j).
\end{aligned}
\]

Hence the total number of $k$-turns in the sequence is at most

\[
4(j-1)K_k(j) + 2K_k(j) + K_k(j) \le 4jK_k(j).
\]

This finishes Case 3. □

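Combining the above lemma with Lemmas 3.1b and 3.3, we obtain the upper bound for $Tu_k(n)$ for all $k$ and $n$:

Theorem 3.2

\[
Tu_k(n) \le \begin{cases} 8n\,\alpha_{\lfloor k/2 \rfloor}(n) & \text{if } k \ne 3 \\ 8n \log\log n & \text{if } k = 3 \end{cases} \qquad \Box
\]

The upper bound for cascades is derived analogously:

Theorem 3.3

\[
C_k(n) \le \begin{cases} 8n\,\alpha_{\lfloor k/2 \rfloor}(n) & \text{if } k \ne 3 \\ 8n \log\log n & \text{if } k = 3 \end{cases}
\]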

Proof. It suffices to prove Lemma 3.4 for $C_k(n)$: $C_k(K_k(j)) \le 4jK_k(j)$ for all $k \ge 1$ and $j \ge 1$. Referring to the proof of Lemma 3.4, only the handling of Cases 2 and 3 has to be modified. Consider Case 2. As before, the blocks have size $2k$ each. The $k$-cascades are categorized as follows:

Type A. All nodes involved in the cascade belong to a single block: There is at most one Type-A cascade per block.

Type B. One of the cascade rotations involves a pair of nodes belonging to a single block, but not all of the nodes of the cascade are in that block: If the cascade rotates an edge that lies on the left path of some block, then the length of the left path of the block decreases by at least 1. Alternatively, if the lowest three nodes involved in the cascade are from the same block, then the number of nodes in that block whose left depth is at most 1 increases. We conclude that the number of Type-B cascades falling under the above categories is at most $2K_k(1) - K_{k-2}(\lfloor k/2 \rfloor)$. In every remaining Type-B $k$-cascade, only the lowest cascade rotation is intrablock and the lowest three nodes do not belong to the same block. Each such cascade behaves like a Type-C cascade in that it causes a $(k-2)$-cascade on the global tree (defined below) which accounts for it.

Type C. Each cascade rotation involves a pair of nodes belonging to different blocks: In this case, for each block, in addition to the root of the block, we also label the left child of the root within the block global, if it exists; if the root has no left child, then the right child of the root is labeled global. Right rotations are propagated from the original tree to the global tree as described in Lemma 3.4 except in the following situation: When the edge joining the root and its left child, say $l$, in a block is rotated, the left child of $l$, say $ll$, now becomes global, and if $ll$ is not adjacent to $l$ in $B$, this results in a series of right single rotations on the global tree (See Figure 3.6). Under this definition of the global tree, the $(k-2)$ interior rotations performed by any Type-C $k$-cascade are all global. Hence, each Type-C $k$-cascade translates into a right $(k-2)$-cascade and a sequence of right single rotations on the global tree. Therefore the total number of $k$-cascades in the sequence is at most

\[
K_{k-2}(\lfloor k/2 \rfloor)/2 + 2K_k(1) - K_{k-2}(\lfloor k/2 \rfloor) + C_{k-2}(K_{k-2}(\lfloor k/2 \rfloor)) \le 4K_k(1).
\]

This completes Case 2 in the proof of Lemma 3.4 for cascades. Case 3 is handled similarly. □

3.2.2     Lower  Bounds 

The  lower  bound  right  twist  sequences  are  inductively  constructed  by  mimicking  the 
divide-and-conquer  strategy  used  to  derive  the  upper  bounds.  The  lower  bound  sequences 
always  transform  a  left  path  tree  into  a  right  path  tree.  The  tree  is  partitioned  into  a 
collection  of  vertex-disjoint  block  trees  and  a  global  tree  is  formed  by  selecting  nodes 
from  each  block  tree.  The  lower  bound  sequence  for  a  tree  is  constructed  by  inductively 
constructing  similar  lower  bound  sequences  for  the  block  trees  and  for  the  global  tree  and 
weaving  these  sequences  together.  Actually,  we  first  inductively  construct  a  sequence  of 
right  twists  as  well  as  deletions  having  sufficiently  many  rotations  of  the  given  type  and 
then  remove  the  deletions  to  obtain  the  lower  bound  sequence. 

We need some definitions. For all positive integers $k$, a right $k$-twist-deletion sequence is defined to be an intermixed sequence of right single rotations, right $k$-twists and deletions of the leftmost node performed on a binary tree. Right $k$-turn-deletion sequences and right $k$-cascade-deletion sequences are defined analogously. Consider a right twist that is performed on some left subpath $x_0 - x_1 - \cdots - x_l$ of a binary tree, where $x_0$ is the lowest node on the subpath. $x_0$ is called the base of the twist. If $x_l$ is the left child of a node $y$ (say) in the tree, then the twist is called an apex twist and $y$ is the apex of the twist. Otherwise, the twist is called apexless.

Figure 3.6: The effect of a right single rotation involving the root of a block and its left child within the block on the global tree. Circles denote the nodes of the block; other symbols denote the nodes from other blocks. The starred nodes in the original tree are the global nodes. (Panels: the original tree; the global tree.)

The lower bound for $Tw_k(n)$ for n of the form $L_k(j) = \binom{k+j-1}{k}$ is given by:

Lemma 3.5 $Tw_k(L_k(j)) \ge \binom{k+j-1}{k+1}$ for all $k \ge 1$ and $j \ge 1$.

Proof. For any pair of positive integers $k$ and $j$, we inductively construct a right $k$-twist-deletion sequence for a left path tree of $L_k(j)$ nodes having the following properties:

1. The sequence deletes all the nodes from the tree.

2. A right $k$-twist always involves the leftmost node of the tree.

3. A deletion always deletes the root of the tree.

4. The sequence has exactly $\binom{k+j-1}{k+1}$ $k$-twists.

The removal of the deletions from the sequence would yield a right twist sequence having the desired number of $k$-twists.

Case 1. $k = 1$ or $j = 1$: Easy.

Case 2. $k \ge 2$ and $j \ge 2$: Divide the left path tree into a lower block of size $L_{k-1}(j)$ and an upper block of size $L_k(j-1)$. Recursively perform a right $(k-1)$-twist-deletion sequence, say $S$, on the lower block tree. For each $(k-1)$-twist in $S$, first rotate the edge joining the root of the lower block with its parent and then perform the $(k-1)$-twist on the block. This is equivalent to a $k$-twist involving the leftmost node of the tree. Each deletion in $S$ is modified by first making the deleted node the root of the tree using right rotations and then deleting the node. By property 4 of $S$, the number of $(k-1)$-twists in $S$ is exactly $\binom{k+j-2}{k}$. The initial depth of the root of the lower block equals $L_k(j-1) = \binom{k+j-2}{k}$. Since each $(k-1)$-twist in $S$ reduces the depth of the root of the lower block by 1 and no other operation in $S$ affects the depth, it is always possible to rotate the root of the lower block just before the execution of any $(k-1)$-twist in $S$. The construction is completed by recursively performing a right $k$-twist-deletion sequence, say $S'$, on the upper block.

The sequence obviously satisfies properties 1-3. The total number of $k$-twists performed by the sequence equals

\[
\begin{aligned}
&(\text{the number of $(k-1)$-twists in } S) + (\text{the number of $k$-twists in } S') \\
&\quad = \binom{k+j-2}{k} + \binom{k+j-2}{k+1} && \text{(By the induction hypothesis)} \\
&\quad = \binom{k+j-1}{k+1}.
\end{aligned}
\]

This proves property 4. □
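The binomial bookkeeping in the twist bounds can be verified mechanically. The following Python sketch is illustrative and not part of the original text. It assumes $L_k(j) = \binom{k+j-1}{k}$, the lower bound count $\binom{k+j-1}{k+1}$ of this proof, and the upper bound $k\binom{k+j-1}{k+1}$ of Lemma 3.2; the function names are hypothetical.

```python
from math import comb


def L(k: int, j: int) -> int:
    """Tree size L_k(j) = C(k + j - 1, k)."""
    return comb(k + j - 1, k)


def twists_lower(k: int, j: int) -> int:
    """Number of k-twists in the lower bound construction: C(k + j - 1, k + 1)."""
    return comb(k + j - 1, k + 1)


def twists_upper(k: int, j: int) -> int:
    """Upper bound of Lemma 3.2: k * C(k + j - 1, k + 1)."""
    return k * comb(k + j - 1, k + 1)


def identities_hold(k: int, j: int) -> bool:
    """Check the Case 2 identities for k >= 2 and j >= 2."""
    split = L(k, j) == L(k - 1, j) + L(k, j - 1)            # block sizes add up
    depth = twists_lower(k - 1, j) == L(k, j - 1)            # one root rotation per (k-1)-twist
    lower = twists_lower(k, j) == twists_lower(k - 1, j) + twists_lower(k, j - 1)
    upper = twists_upper(k, j) == twists_upper(k - 1, j) + twists_upper(k, j - 1) + L(k, j - 1)
    return split and depth and lower and upper


print(all(identities_hold(k, j) for k in range(2, 12) for j in range(2, 12)))
```

All four checks are instances of Pascal's rule, which is why the upper and lower bounds differ only by the factor $k$.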

Combining  Lemma  3.1a  with  the  above  lemma  yields: 

Theorem 3.4 $Tw_k(n) \ge n^{1+1/k}/(2e) - O(n)$ for all $k \ge 1$ and $n \ge 1$. □

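We construct the lower bound sequences for turns. As in the upper bound proof, we need a new Ackerman-like hierarchy of functions. Define:

\[
\begin{aligned}
B_1(j) &= j && \text{for all } j \ge 1 \\
B_2(j) &= 2^j - 1 && \text{for all } j \ge 1 \\
B_i(j) &= \begin{cases} 1 & \text{if } i \ge 3 \text{ and } j = 1 \\ \big((i+1)\,j\,B_i(j-1) + 1\big)\, B_{i-2}\big((i+1)\,j\,B_i(j-1)\big) & \text{if } i \ge 3 \text{ and } j \ge 2 \end{cases}
\end{aligned}
\]

The function $B_i$ grows essentially at the same rate as the Ackerman function $A_{\lfloor i/2 \rfloor}$:

Lemma 3.6
1. $B_3(j) \le 2^{2^{2j}}$ for all $j \ge 1$.
2. $B_i(j) \le A_{\lfloor i/2 \rfloor}(3j)$ for all $i \ne 3$ and $j \ge 1$. □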

The lower bound for $Tu_k(n)$ for n of the form $B_k(j)$ is given by:

Lemma 3.7 $Tu_k(B_k(j)) \ge (1/2)(j - 3)B_k(j)$ for all $k \ge 1$ and $j \ge 1$.

Proof. For any pair of positive integers $k$ and $j$, we inductively construct a right $k$-turn-deletion sequence for the left path tree of $B_k(j)$ nodes having the following properties:

1. The sequence deletes all the nodes from the tree.

2. A right $k$-turn always involves the leftmost node of the tree.

3. A deletion always deletes the root of the tree.

4. The sequence comprises at least $(1/2)(j - 3)B_k(j)$ apex $k$-turns. Further, if $k \ge 3$, there are no apexless $k$-turns in the sequence.

5. For any node $x$, the number of apex $k$-turns with base $x$ is at most $j$.

6. For any node $x$, the number of apex $k$-turns with apex $x$ is at most $j$.

Case 1. $k = 1$: The sequence repeatedly rotates the leftmost node to the root and deletes it.

Case 2. $k = 2$: Divide the left path tree into a lower left subpath comprising $2^{j-1} - 1$ nodes, a middle node, and an upper left subpath comprising $2^{j-1} - 1$ nodes. Recursively perform a right 2-turn-deletion sequence on the lower subpath. Modify each deletion in this sequence as follows: Perform a 2-turn on the subpath defined by the deleted node, say $x$, its parent (the middle node), and its grandparent; make $x$ the root of the tree by successively rotating the edge joining it and its parent; delete $x$ from the tree (this also deletes $x$ from the lower subpath). Next, delete the middle node, which is currently the root of the tree. Finally, recursively perform a right 2-turn-deletion sequence on the upper subpath (See Figure 3.7).

This sequence performs $(j - 2)2^{j-1} + 1$ 2-turns, of which exactly $j - 1$ are apexless. Therefore the number of apex 2-turns is at least $(j - 3)2^{j-1} \ge (1/2)(j - 3)B_2(j)$ for $j \ge 3$; for $j \le 2$ the bound $(1/2)(j - 3)B_2(j)$ is negative and holds trivially. This proves property 4. The remaining properties are easy to check.
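The count of 2-turns in Case 2 follows from a short recurrence. The Python sketch below is illustrative and not from the original text. It assumes that the construction on a left path of $2^j - 1$ nodes runs the recursive sequence on the lower subpath, adds exactly one extra 2-turn for each of the $2^{j-1} - 1$ deletions there, and then runs the recursive sequence on the upper subpath; the function names are hypothetical.

```python
def two_turns(j: int) -> int:
    """Number of 2-turns performed on a left path of 2**j - 1 nodes,
    assuming one extra 2-turn per deletion in the lower subpath."""
    if j == 1:
        return 0                       # a single node admits no 2-turn
    lower_size = 2 ** (j - 1) - 1      # deletions performed on the lower subpath
    return two_turns(j - 1) + lower_size + two_turns(j - 1)


def closed_form(j: int) -> int:
    """Claimed total: (j - 2) * 2^(j-1) + 1."""
    return (j - 2) * 2 ** (j - 1) + 1


def apex_lower_bound_holds(j: int) -> bool:
    """After discarding the j - 1 apexless turns, at least (j - 3) * 2^(j-1) remain."""
    return closed_form(j) - (j - 1) >= (j - 3) * 2 ** (j - 1)


print([two_turns(j) for j in range(1, 6)])
```

The recurrence doubles the count at each level and adds about $2^{j-1}$ new turns, which gives the $\Theta(j\,2^{j})$, that is $\Theta(n \log n)$, growth used to refute the Right Turn Conjecture.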

Figure 3.7: The lower bound construction for right 2-turns. The construction recursively transforms a left path tree of size $2^j - 1$ into a right path tree.

Case 3. $k \ge 3$ and $j = 1$: Just delete the only node in the tree.

Case 4. $k \ge 3$ and $j \ge 2$: Let $s = (k + 1)\,j\,B_k(j - 1)$. We inductively construct the sequences of operations performed on the block trees of the tree:

Lemma 3.8 There exists a right $k$-turn-deletion sequence for a left path tree of size $s + 1$ satisfying properties 1 and 2 and the following properties:

3. A deletion that is not the last operation in the sequence always deletes the left child of the root.

4. The sequence comprises at least $(1/2)(j - 4)s + j$ right $k$-turns, all of which are apex turns.

5. For any node $x$, the number of apex $k$-turns with base $x$ is at most $j - 1$.

6. For any node $x$, the number of apex $k$-turns with apex $x$ is at most $j - 1$.

7. The root of the tree is always the rightmost node.

Proof.  Divide  the  nodes  of  the  tree  excluding  the  root  into  a  sequence  of  (k  +  l)j 
blocks  of  size  Bk(j  -  1)  each.  Perform  a  right  fc-turn-deletion  sequence  obeying  properties 
1-6  on  the  lowest  block.  (The  inductive  hypothesis  implies  the  existence  of  such  a 
sequence.)  Denote  this  sequence  by  5.  Each  deletion  in  S  except  the  last  is  modified  by 
rotating  the  deleted  node  up  the  tree  until  it  is  adjacent  to  the  root  and  then  deleting 
it.  The  deletion  of  the  last  node  in  the  block,  say  ii,  is  implemented  diiferently.  xi  is 
rotated  up  the  tree  until  it  is  in  contact  with  the  root  of  the  next  higher  block,  say  12- 
12  is  rotated  upwards  in  a  similar  fashion  in  order  to  make  it  adjacent  to  the  root  of  the 
next  higher  block,  say  13.  In  this  manner  we  create  a  left  subpath  xi  -  12  -  ■  •  •  -  Xk+i 
containing  the  roots  of  the  lowest  A;  +  1  blocks.  Next,  a  right  A;-turn  is  performed  on  this 
subpath  and  then  ii  is  rotated  up  the  tree  and  deleted.  Following  this,  5  is  executed 
on  the  blocks  of  nodes  X2,X3. .  .,Xk+i  in  succession.  Each  deletion  is  modified  by  first 
making  the  deleted  node  adjacent  to  the  root  and  then  deleting  it.  At  the  conclusion  of 
this  sequence  of  operations,  all  the  nodes  in  the  lowest  k  +  l  blocks  of  the  tree  have  been 
deleted  and  at  least  (l/2)(j  -  4){k  +  1)5^0  -  1)  +  1  apex  *:-turns  have  been  performed. 
The  above  sequence  of  operations  is  repeated  on  each  group  of  A:  +  1  consecutive  blocks, 
choosing  the  lowest  group  of  blocks  currently  in  the  tree  each  time.  The  final  operation 
of  the  sequence  deletes  the  root. 

It is obvious that the right $k$-turn-deletion sequence constructed above satisfies
properties 1, 2, 3 and 7. Since there are $j$ groups of $k+1$ blocks each, the total number of
apex $k$-turns executed by the sequence is at least $(1/2)(j-4)s + j$, where
$s = (k+1)jB_k(j-1)$ is the number of nodes excluding the root. Further, by property
4 of $S$, the sequence performs only apex $k$-turns. This proves property 4. Properties 5
and 6 are easy to show using properties 5 and 6 of sequence $S$. □
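The basic step used throughout this construction, rotating a node up a left path until it is adjacent to a chosen ancestor, can be sketched in code. The following is only a minimal illustration and not part of the original proof; the names `Node`, `left_path_tree`, `rotate_right` and `rotate_up_until_child_of` are hypothetical helpers introduced here.

```python
class Node:
    """Binary tree node with parent pointers (illustrative helper)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None


def left_path_tree(n):
    """Build a left path tree with keys 1..n; the node with key n is the root."""
    root = Node(n)
    node = root
    for key in range(n - 1, 0, -1):
        child = Node(key)
        node.left = child
        child.parent = node
        node = child
    return root


def rotate_right(x):
    """Rotate the edge joining x and its parent p, where x is the left child of p."""
    p = x.parent
    assert p is not None and p.left is x
    g = p.parent
    # The right subtree of x becomes the left subtree of p.
    p.left = x.right
    if x.right is not None:
        x.right.parent = p
    # p becomes the right child of x, and x takes p's place under g.
    x.right = p
    p.parent = x
    x.parent = g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x
    return x


def rotate_up_until_child_of(x, target):
    """Right-rotate x upward until its parent is target; return the rotation count."""
    count = 0
    while x.parent is not target:
        rotate_right(x)
        count += 1
    return count
```

On a five-node left path, rotating the leftmost node until it is adjacent to the root takes three right rotations and leaves the remaining nodes as a left path hanging from its right child.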

We construct a right $k$-turn-deletion sequence for the left path tree of size $B_k(j)$
satisfying the six properties. The tree is partitioned into $B_{k-2}(s)$ blocks of size $s+1$
each. The root of each block is labeled global. The global nodes form a global tree as
described in the proof of Lemma 3.4. By the induction hypothesis, there exists a right
$(k-2)$-turn-deletion sequence, say $\hat{S}$, for the global tree, satisfying properties 1-6. We
construct the right $k$-turn-deletion sequence, denoted $S$, for the original tree by mapping
each global tree operation in $\hat{S}$ onto a sequence of original tree operations, preserving the
correspondence between the two trees. The following invariants define the relationships
between the two trees:

A. Let $B$ denote the block containing the leftmost node of the tree and let $x$ denote
the root of $B$. Suppose that $d$ nodes have been deleted from $B$ so far. Then,

i. The number of apex $(k-2)$-turns performed so far on the global tree that
had $x$ as their base is exactly $d$.

ii. Denote by $\tilde{S}$ the right $k$-turn-deletion sequence constructed by Lemma 3.8
that deletes all the nodes from the left path tree of size $s+1$. Let $T$ denote
the tree that results when the prefix of $\tilde{S}$ up to the $d$-th deletion is executed
on the left path tree. The block tree of $B$ equals $T$.

B. Consider any block $B$ that does not contain the leftmost node of the tree. Let $x$
denote the root of $B$. The block tree of $B$ is a left path tree which is divided into
the root and two subpaths. The nodes in the lower subpath, called black nodes,
are the nodes in $B$ that have participated in a $k$-turn. The nodes in the upper
subpath are called white nodes.

i. If $b$ denotes the number of black nodes currently in $B$, then exactly $b$ of the
apex $(k-2)$-turns performed so far on the global tree had $x$ as their apex.

C. If $x$ is a global node with a right child $y$ in the global tree, $y$ is also the right child
of $x$ in the original tree.

D. If $x$ is a global node with a left child $y$ in the global tree, there is a left subpath
$x = x_0 - x_1 - \cdots - x_{k+1} = y$ in the original tree such that $x_1, x_2, \ldots, x_k$ are the
set of white nodes in the block of $x$.

Each global tree operation in $\hat{S}$ is simulated as follows:

Right rotation: Suppose that a global tree edge $[x,y]$, such that $y$ is a left child of $x$,
is rotated. In the original tree we repeatedly rotate the edge connecting $y$ and its
parent until $x$ becomes the right child of $y$. Only invariants C and D are affected
by the rotations. It is not hard to see that both these invariants are true after the
last rotation.

Deletion: Suppose that a global node $x$ is deleted. Since $x$ is the root of the global tree,
it is also the root of the original tree. Let $d$ denote the number of nodes deleted so
far from the block of $x$. We perform $\tilde{S}$ (the sequence constructed by Lemma 3.8)
on the block tree of $x$ starting immediately after the $d$-th deletion. Each deletion is
modified so that the deleted node is first made the root of the tree and then deleted.
Invariant A.ii ensures that this is valid and that this will result in the deletion of
all the nodes in the block of $x$ from the tree. Therefore this sequence of operations
reestablishes the correspondence between the global tree and the original tree.

Apexless $(k-2)$-turn: Break up the turn into a sequence of $k-2$ global rotations and
simulate each global rotation as specified above.

Apex $(k-2)$-turn: Suppose that a global tree subpath $x_1 - x_2 - \cdots - x_{k-1}$ is turned
and that $x_1$ is the base of the turn. Let $x_0$ denote the leftmost node in the block of
$x_1$ and let $x_k$ denote the parent of $x_{k-1}$ in the original tree. We create the subpath
$x_0 - x_1 - \cdots - x_k$ in the original tree and perform a $k$-turn on this subpath. This
is implemented as follows:

1. Let $d$ denote the number of nodes deleted so far from the block of $x_1$. Execute
the segment of the Lemma 3.8 sequence $\tilde{S}$ between the $d$-th and the $(d+1)$-th deletions
(excluding the deletions) on the block tree of $x_1$. By Lemma 3.8, property 3,
this makes node $x_0$ the left child of $x_1$.

2. Rotate $x_1$ up the tree until its parent is $x_2$. Continuing in this fashion, rotate
the nodes $x_2, x_3, \ldots, x_{k-1}$ upwards, creating a left subpath $x_0 - x_1 - \cdots - x_k$.

3. Perform a $k$-turn on the subpath $x_0 - x_1 - \cdots - x_k$.

4. Rotate $x_0$ up the tree, making it the root, and delete it.

5. Since $x_k$ has become black due to the $k$-turn, the edge joining $x_k$ and its left
child is repeatedly rotated until the left child of $x_k$ is not a global node.

Invariant B.i and property 6 of the global sequence $\hat{S}$ guarantee that $x_k$ is white at the
beginning of this sequence of operations. Similarly, invariant A.i and property 5 of $\hat{S}$
guarantee that $x_0$ is well defined. Observe that all invariants are true at the end of the
simulation.

The sequence of operations performed on the original tree during the simulation of the
global sequence $\hat{S}$ constitutes sequence $S$.

$S$ deletes all the nodes in the original tree since $\hat{S}$ deletes all the nodes in the global
tree. This proves that $S$ satisfies property 1.

Properties 2 and 3 of $S$ are apparent from the simulation procedure.

By Lemma 3.8, at least $(1/2)(j-4)s + j$ apex $k$-turns (local turns) are performed
during the execution of the Lemma 3.8 sequence $\tilde{S}$ on any particular block. Hence the total
number of local turns summed over all blocks is at least $(1/2)(j-4)sB_{k-2}(s) + jB_{k-2}(s)$.
The number of turns involving global nodes (global turns) equals the number of $(k-2)$-turns
in $\hat{S}$ which, by the induction hypothesis, is at least $(1/2)(s-3)B_{k-2}(s)$. Therefore the
total number of $k$-turns in $S$ is at least

$$
\begin{aligned}
& (1/2)(j-4)sB_{k-2}(s) + jB_{k-2}(s) + (1/2)(s-3)B_{k-2}(s) \\
&\quad = \bigl((1/2)(j-3)s + j - 3/2\bigr)B_{k-2}(s) \\
&\quad \ge (1/2)(j-3)B_k(j),
\end{aligned}
$$

where the last step uses $B_k(j) = (s+1)B_{k-2}(s)$ and $j - 3/2 \ge (1/2)(j-3)$.

Evidently, every $k$-turn in $S$ has an apex. This proves property 4.

For any node $x$, there is at most one global turn with $x$ as the base since $x$ is deleted
from the tree immediately after the turn. By Lemma 3.8, there are at most $j-1$ local
turns with $x$ as the base. We conclude property 5. Property 6 is proved analogously. □

Combining the above lemma with Lemma 3.6 yields:

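Theorem 3.5

$$
Tu_k(n) \ge
\begin{cases}
(1/2)\, n\, \alpha_{\lfloor k/2 \rfloor}(n) - O(n) & \text{if } k \ne 3, \\
(1/8)\, n \log\log n - O(n) & \text{if } k = 3.
\end{cases}
\qquad \Box
$$

The lower bound for cascades is given by:

Theorem 3.6

$$
C_k(n) \ge
\begin{cases}
(1/2)\, n\, \alpha_{\lfloor k/2 \rfloor}(n) - O(n) & \text{if } k \ne 3, \\
(1/8)\, n \log\log n - O(n) & \text{if } k = 3.
\end{cases}
$$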

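Proof. We modify the lower bound proof for $Tu_k(n)$ given above. Define:

$$
\begin{aligned}
B'_1(j) &= j \quad \text{for all } j \ge 1, \\
B'_2(j) &= 3 \cdot 2^j - 2 \quad \text{for all } j \ge 1, \\
B'_i(j) &=
\begin{cases}
1 & \text{if } i \ge 3 \text{ and } j = 1, \\
\bigl(4ijB'_i(j-1) + 3\bigr)\, B'_{i-2}\bigl(4ijB'_i(j-1)\bigr) & \text{if } i \ge 3 \text{ and } j \ge 2.
\end{cases}
\end{aligned}
$$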

It is easy to check that Lemma 3.6 holds for the new hierarchy $\{B'_i\}$. We prove the analogue
of Lemma 3.7 for $C_k(n)$, which states that $C_k(B'_k(j)) \ge (1/2)(j-3)B'_k(j)$ for all $k \ge 1$
and $j \ge 1$. We construct a right $k$-cascade-deletion sequence that converts a left path
tree of size $B'_k(j)$ into a right path tree and satisfies the analogues of properties 1-6 for
cascades.
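To get a feel for how quickly the modified hierarchy grows, the recurrence can be evaluated directly for small indices. The sketch below is illustrative only: the function name `b_prime` is introduced here, and the recurrence is the one read from the definition of $B'_i(j)$ above ($B'_1(j) = j$, $B'_2(j) = 3 \cdot 2^j - 2$, and the two-case rule for $i \ge 3$).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def b_prime(i, j):
    """Evaluate B'_i(j) from the recurrence (illustrative helper name)."""
    if i < 1 or j < 1:
        raise ValueError("indices must be positive integers")
    if i == 1:
        return j
    if i == 2:
        return 3 * 2**j - 2
    if j == 1:
        return 1
    # s = 4 i j B'_i(j-1); the tree consists of B'_{i-2}(s) blocks of s + 3 nodes.
    s = 4 * i * j * b_prime(i, j - 1)
    return (s + 3) * b_prime(i - 2, s)
```

For example, `b_prime(3, 2)` is `27 * 24 = 648`, and `b_prime(4, 2)` is already `35 * (3 * 2**32 - 2)`. Only very small arguments are practical, since the values for $i \ge 3$ grow extremely fast in $j$.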

Case 1. $k = 1$ or $j = 1$: Easy.

Case 2. $k = 2$: Divide the left path tree into a lower subpath and an upper
subpath, each having $3 \cdot 2^{j-1} - 2$ nodes, and two middle nodes. The right
2-cascade-deletion sequence is constructed by recursing on the lower and upper subpaths in turn
and performing a 2-cascade involving the deleted node and the middle nodes for each
deletion in the first recursive step. The sequence comprises $(3j-4)2^{j-1} - j + 2 \ge
(1/2)(3 \cdot 2^j - 2)(j-3)$ apex 2-cascades and satisfies all the properties.

Case 3. $k \ge 3$ and $j \ge 2$: Let $s = 4kjB'_k(j-1)$. A $p,q$-zigzag tree is a tree that is
constructed from a $p$-node left path tree by inserting a $q$-node left path tree as the right
subtree of the leftmost node. We extend Lemma 3.8 to cascades and construct a right
$k$-cascade-deletion sequence, say $\tilde{S}$, for a $3,s$-zigzag tree that comprises $(1/2)(j-4)s + 2j$
apex $k$-cascades. Each deletion in the sequence, except for the last two deletions, deletes
the leftmost grandchild of the root. The proof divides the tree into $2j$ groups of $2k$
blocks each, each block, in turn, of size $B'_k(j-1)$, and recursively performs a right
$k$-cascade-deletion sequence on each block, choosing the blocks in bottom-to-top order. A
$k$-cascade is performed on the roots of the blocks within each group, yielding $2j$ extra
cascades.

The tree is partitioned into $B'_{k-2}(s)$ blocks of size $s+3$ each. The global tree is
constructed from the roots of the blocks and a right $(k-2)$-cascade-deletion sequence,
say $\hat{S}$, satisfying properties 1-6 is recursively performed on it. The simulation of $\hat{S}$ on
the original tree maintains the invariants A (with expression $s+3$ replacing $s+1$), C
and D and the following modification of invariant B:

B. Consider any block $B$ that does not contain the leftmost node of the tree. Let $x$
denote the root of $B$. Block $B$ induces a connected subtree in the original tree.
Further, if exactly $b$ of the $(k-2)$-cascades performed so far on the global tree
had $x$ as their apex, then the block tree of $B$ is a $(s-b+3),b$-zigzag tree.

The simulation of global tree operations other than apex $(k-2)$-cascades is as before.
Consider an apex $(k-2)$-cascade in $\hat{S}$ involving a global tree subpath $x_1 - x_2 - \cdots - x_{2k-4}$,
such that $x_1$ is the base of the cascade. Let $y_1$ and $y_2$ denote, respectively, the leftmost
node in the block of $x_1$ and the left child of $x_1$. Let $z_1$ and $z_2$ denote, respectively,
the parent and the grandparent of $x_{2k-4}$ in the original tree. The global tree cascade is
simulated on the original tree by creating the subpath
$y_1 - y_2 - x_1 - x_2 - \cdots - x_{2k-4} - z_1 - z_2$
in the original tree, performing a $k$-cascade on this subpath, and finally deleting $y_1$ from
the tree. We verify that the resulting sequence, say $S$, satisfies property 4:

$$
\begin{aligned}
\#\, k\text{-cascades in } S &= \#\,\text{local cascades} + \#\,\text{global cascades} \\
&\ge \bigl((1/2)(j-4)s + 2j\bigr)B'_{k-2}(s) + (1/2)(s-3)B'_{k-2}(s) \\
&\ge (1/2)(j-3)B'_k(j).
\end{aligned}
$$

The rest of the properties of $S$ are easy to check. This completes the proof of Lemma 3.7
for cascades. The theorem follows. □
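The last inequality in the count of $k$-cascades can be checked by collecting the coefficient of $B'_{k-2}(s)$ and using $B'_k(j) = (s+3)B'_{k-2}(s)$, which holds because the tree is partitioned into $B'_{k-2}(s)$ blocks of size $s+3$. A short worked derivation, added here for convenience and not taken from the original text:

```latex
\begin{align*}
\bigl(\tfrac12 (j-4)s + 2j\bigr) B'_{k-2}(s) + \tfrac12 (s-3) B'_{k-2}(s)
  &= \bigl(\tfrac12 (j-3)s + 2j - \tfrac32\bigr) B'_{k-2}(s), \\
\tfrac12 (j-3) B'_k(j)
  = \tfrac12 (j-3)(s+3) B'_{k-2}(s)
  &= \bigl(\tfrac12 (j-3)s + \tfrac32 j - \tfrac92\bigr) B'_{k-2}(s), \\
2j - \tfrac32 &\ge \tfrac32 j - \tfrac92 \quad \text{for all } j \ge 1.
\end{align*}
```

The first line is therefore at least the second, which is the claimed bound.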

3.3     An  Upper  Bound  for  the  Deque  Conjecture 

In this section, we show that Splay takes $O((m+n)\alpha(m+n))$ time to process a sequence
of $m$ deque operations on an $n$-node binary tree. We reduce a deque operation sequence
to right cascade sequences on auxiliary trees and apply the upper bounds for cascades.

Define the cost of a deque operation or a right twist operation to be the number of
single rotations performed by the operation.

The cost of a set of right cascades in a right twist sequence is given by:

Lemma 3.9 Consider an arbitrary right twist sequence executed on an $n$-node binary
tree. The total cost of any $m$ right cascades in the sequence equals $O((m+n)\alpha(m+n,n))$.

Proof. Let $l = 2\alpha(m+n,n) + 2$. Split each of the $m$ right cascades into a sequence
of right $l$-cascades followed by a sequence of at most $l-1$ rotations. By Theorem 3.3,
the total number of right $l$-cascades is at most $8n\alpha_{\lfloor l/2 \rfloor}(n)$. This yields a bound of
$(m + 8n\alpha_{\lfloor l/2 \rfloor}(n))\, l$ for the number of rotations performed by the $m$ right cascades. We
bound $\alpha_{\lfloor l/2 \rfloor}(n)$ as follows:

$$
\begin{aligned}
A_{\alpha(m+n,n)+1}(\lfloor m/n \rfloor + 2)
  &= A_{\alpha(m+n,n)}\bigl(A_{\alpha(m+n,n)+1}(\lfloor m/n \rfloor + 1)\bigr) \\
  &\ge A_1\bigl(A_{\alpha(m+n,n)}(\lfloor (m+n)/n \rfloor)\bigr) \\
  &\ge n.
\end{aligned}
$$

Therefore $\alpha_{\lfloor l/2 \rfloor}(n) = \alpha_{\alpha(m+n,n)+1}(n) \le \lfloor m/n \rfloor + 2$. The lemma follows. □

Remark.  Hart  and  Sharir  [19]  proved  a  result  similar  to  Lemma  3.9  concerning 
sequences  of  certain  path  compression  operations  on  rooted  ordered  trees.  Their  result 
can  be  derived  from  the  analogue  of  Lemma  3.9  for  turns  by  interpreting  turns  in  a 
binary  tree  as  path  compressions  on  the  rooted  ordered  tree  representation  of  the  binary 
tree.  It  is  interesting  that  they  also  use  ideas  similar  to  blocks  and  global  tree  in  their 
proof. 

We  estimate  the  cost  of  a  sequence  of  deque  operations  performed  at  one  end  of  a 
binary  tree  that  also  has  left  and  right  path  rotations: 

Lemma 3.10 Consider an intermixed sequence of Pops, Pushes, left path rotations and
right path rotations performed on an arbitrary $n$-node binary tree. The total cost of Pop
operations equals $O((m+n)\alpha(m+n))$, where $m$ denotes the number of Pops and Pushes
in the sequence.

Proof. We simplify the sequence through a series of transformations without
undercounting Pop rotations.

Simplification 3.1 The first operation of the sequence is a Pop.

Transformation. Delete the operations preceding the first Pop from the sequence
and modify the initial tree by executing the deleted prefix of the sequence on it. □

Simplification  3.2    The  sequence  does  not  contain  Push  operations. 

Transformation.  For  each  Push  operation,  insert  a  node  into  the  initial  tree  as 
the  symmetric  order  successor  of  the  last  node  that  was  popped  before  the  Push.  The 
Push  operation  itself  is  implemented  by  just  rotating  its  corresponding  node  to  the  root 
through  right  rotations.    D 

Define  a  Partialpop  to  be  a  sequence  of  arbitrarily  many  right  2-turns  performed 
on  the  leftmost  node  of  a  binary  tree  followed  by  deletion  of  the  node. 

Simplification 3.3 The sequence consists of only Partialpops and left path rotations;
the lemma is true if the total cost of Partialpop operations equals $O((m+n)\alpha(m+n))$,
where $m$ denotes the number of Partialpops in the sequence and $n$ denotes the size of
the initial tree.

Transformation.  Normalize  the  tree  by  rotating  the  nodes  on  the  right  path  across 
the  root  into  the  left  path  and  consider  the  resulting  sequence.    D 

Simplification 3.4 The sequence comprises only right cascades; the lemma is true if
the total cost of any $m$ cascades in the sequence equals $O((m+n)\alpha(m+n))$.

Transformation.  Instead  of  deleting  nodes  at  the  end  of  Partialpops,  rotate  them 
upwards  to  the  right  path.    D 

The lemma follows from Simplification 3.4 and Lemma 3.9. □

The  upper  bound  for  the  Deque  Conjecture  is  given  by: 

Theorem 3.7 The cost of performing an intermixed sequence of $m$ deque operations on
an arbitrary $n$-node binary tree using splay equals $O((m+n)\alpha(m+n))$.

Proof. Divide the sequence of operations into a series of epochs as follows: The
first epoch comprises the first $\max\{\lfloor n/2 \rfloor, 1\}$ operations in the sequence. For all $i \ge 1$,
if the tree contains $k$ nodes at the end of epoch $i$, then epoch $i+1$ consists of the
next $\max\{\lfloor k/2 \rfloor, 1\}$ operations in the sequence. The last epoch might consist of fewer
operations than specified. It suffices to show that the cost of an epoch that starts with
a $k$-node tree is $O(k\alpha(k))$, since the sum of the sizes of the starting trees over all epochs
is $O(m+n)$.
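The epoch decomposition can be made concrete with a short sketch. It is illustrative only: the function name `split_into_epochs` and the string labels for the four deque operations are assumptions introduced here, and the sketch tracks only the tree size, not the tree itself.

```python
def split_into_epochs(n, ops):
    """Split a deque operation sequence into epochs.

    n   -- number of nodes in the initial tree
    ops -- sequence of "push", "inject", "pop" or "eject" (assumed valid)
    Returns a list of (starting_size, operations_in_epoch) pairs.
    """
    delta = {"push": 1, "inject": 1, "pop": -1, "eject": -1}
    ops = list(ops)
    epochs = []
    size = n
    pos = 0
    while pos < len(ops):
        # An epoch starting with a k-node tree has max(floor(k/2), 1) operations.
        length = max(size // 2, 1)
        chunk = ops[pos:pos + length]  # the last epoch may be shorter
        epochs.append((size, chunk))
        for op in chunk:
            size += delta[op]
            if size < 0:
                raise ValueError("operation applied to an empty tree")
        pos += len(chunk)
    return epochs
```

One way to see the $O(m+n)$ claim: every epoch except the last has a starting size of at most three times its length, and the last epoch starts with at most $n + m$ nodes, so the starting sizes sum to at most $4m + n$.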

Consider an epoch whose initial tree, say $T$, has $k \ge 2$ nodes. Divide $T$ into a left
block of $\lfloor k/2 \rfloor$ nodes and a right block of $\lceil k/2 \rceil$ nodes. This partitioning ensures that
neither block gets depleted before the epoch completes. The total cost of Pushes and
Injects is 0, since only rotations contribute to the operation cost. We show that the
total cost of Pops is $O(k\alpha(k))$. The same proof will apply to Ejects.

A Pop on $T$ translates into a Pop on the left block. The effect of a Pop on the right
block is a series of left path rotations. It is easy to see that the total number of single
rotations performed by a Pop is at most

(the number of single rotations performed by the Pop on the left block) +
2(the number of left path rotations performed on the right block) + 2.

A Push operation on the tree propagates as a Push on the left block. An Eject
performs only right path rotations on the left block. An Inject does not affect the left
block. Hence by Lemma 3.10, the total number of single rotations performed by all the
Pops on the left block equals $O(2\lfloor k/2 \rfloor \alpha(2\lfloor k/2 \rfloor)) = O(k\alpha(k))$. A left path rotation
on the right block decreases the size of the left path of the block by 1. The initial size
of this path is at most $\lceil k/2 \rceil$ and the size increases by at most 1 per deque operation.
Therefore the total number of left path rotations performed on the right block due to
Pops is at most $k + 1$. This leads to an $O(k\alpha(k))$ upper bound on the total cost of all
the Pops performed during the epoch. □


3.4      New  Proofs  of  the  Scanning  Theorem 

In the two subsections of this section, we describe a simple potential-based proof of the
Scanning Theorem and an inductive proof that generalizes the theorem.

3.4.1      A  Potential-based  Proof

The  proof  rests  on  the  observation  that  a  certain  subtree  of  the  binary  tree,  called  the 
kernel  tree,  which  is  mainly  involved  in  the  splay  operations  always  has  a  very  nice 
shape.  As  the  nodes  of  the  original  tree  are  accessed  using  splays,  the  kernel  tree  evolves 
through  insertions  and  deletions  of  nodes  at  the  left  end,  and  left  path  cascades  caused 
by  the  splays.  Each  node  of  the  kernel  tree  is  assigned  a  unimodal  potential  function, 
that is, a potential function that initially steadily increases to a maximum value and
then  steadily  decreases  once  the  node  has  progressed  sufficiently  through  the  kernel  tree. 
The  nice  shape  of  the  kernel  tree  guarantees  that  most  of  the  nodes  involved  in  each 
splay  are  in  their  potential  decrease  phase,  enabling  their  decrease  in  potentials  to  pay 
for  all  the  rotations  and  the  small  increase  in  potentials  of  the  nodes  in  their  potential 
increase  phase. 

We  need  some  definitions.  A  binary  tree  is  called  rightist  if  the  depths  of  the  leaves 
of  the  tree  increase  from  left  to  right.  The  left  and  right  heights  of  a  binary  tree  are 
defined,  respectively,  to  be  the  depths  of  the  leftmost  and  rightmost  nodes.  The  right 
inner height of a node x is defined to be the depth of the successor of x within x's subtree
if x has a right subtree and 0 otherwise.
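These definitions translate directly into code. The sketch below is illustrative only: the class `Node` and the function names are introduced here, and it assumes that depth is measured in edges, so that the root of a (sub)tree has depth 0.

```python
class Node:
    """Minimal binary tree node (illustrative helper)."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def left_height(root):
    """Depth of the leftmost node of the tree rooted at root."""
    depth = 0
    while root.left is not None:
        root = root.left
        depth += 1
    return depth


def right_height(root):
    """Depth of the rightmost node of the tree rooted at root."""
    depth = 0
    while root.right is not None:
        root = root.right
        depth += 1
    return depth


def right_inner_height(x):
    """Depth, within x's subtree, of the successor of x; 0 if x has no right subtree."""
    if x.right is None:
        return 0
    # The successor of x within its subtree is the leftmost node of the right subtree.
    return 1 + left_height(x.right)
```

For instance, a node whose right child heads a left path of three nodes has right inner height 3 under this convention.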

We  are  ready  to  describe  the  proof.  At  any  time  during  the  sequence  of  splays, 
the  set  of  nodes  in  the  current  tree  that  have  been  involved  in  a  splay  rotation  form  a 
connected  subtree  of  the  tree,  called  the  kernel  tree,  whose  root  coincides  with  the  root 
of the right subtree of the tree. Initially, the kernel tree is empty. The sequence of splays
on the original tree propagates into an intermixed sequence of $n$ Pushes and $n$ Pops
on the kernel tree, where a Push inserts a new node at the bottom of the left path of
the tree and a Pop splays at and deletes the leftmost node of the tree. Our goal is to
show that the cost of the sequence of operations on the kernel tree equals $O(n)$. The
theorem would then follow immediately.

The argument focuses on the sequence of operations on the kernel tree. Since the
kernel tree is created by a sequence of Pushes and Pops, it satisfies the following two
properties:

1. The subtrees hanging from the left and right paths of the tree are rightist.

2. If $T_1$ and $T_2$ are subtrees hanging from the left path with $T_1$ to the left of $T_2$, then
$\mathrm{rightheight}(T_1) \le \mathrm{leftheight}(T_2)$.

This can be easily shown using induction.

We  use  the  following  potential  function.  The  potential  of  the  kernel  tree  equals  the 
sum  of  the  potentials  of  all  its  nodes.  The  potential  of  a  node  consists  of  an  essential 
component and a nonessential component. For any node $x$, let $ld(x)$ and $rih(x)$ denote,
respectively, the left depth and the right inner height of $x$. The essential potential of $x$
equals $\min\{\lceil \log ld(x) \rceil, rih(x)\}$ unless $x$ is on the right path in which case its essential
potential equals 0. The essential potential of a node is a unimodal function of time,
since the potential first monotonically increases from 0 until the node's right inner height
overtakes the logarithm of its left depth and then monotonically decreases. The nonessential
potential  of  x  equals  2  units  if  x  is  not  on  the  right  path  and  x's  left  child  has  the  same 
right  inner  height  as  x,  and  equals  0  otherwise. 

We compute the amortized cost of kernel tree operations. Push has amortized cost
2, to provide for the nonessential potential that may be needed by the parent of the
inserted node. Consider a Pop. Let $x$ denote the lowest node on the left path such
that $\lceil \log ld(x) \rceil \le rih(x)$. Every double rotation of a splay step that involves two
nodes with identical right inner heights is paid for using the nonessential potential of
the node leaving the left path. The number of remaining double rotations is at most
$\lfloor ld(x)/2 \rfloor + \lceil \log(ld(x)+1) \rceil$. The first term accounts for the double rotations involving
two proper ancestors of $x$ and the second term accounts for the double rotations involving
the descendants of $x$. Each rotation in the latter category increases the potential by at most
1, contributing to a net increase of at most $\lceil \log(ld(x)+1) \rceil$ units of potential. The
halving of the left depths of the ancestors of $x$ caused by the splay operation decreases
the potential by exactly $ld(x) - 1$. The amortized cost of Pop is therefore bounded by

$$\lfloor ld(x)/2 \rfloor + 2\lceil \log(ld(x)+1) \rceil - (ld(x) - 1) \le 5.$$
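The constant 5 in this bound can be confirmed numerically. The sketch below is an illustration added here, not part of the original argument; it assumes that the logarithm is taken to base 2 and that $ld(x)$ is a positive integer, and the name `pop_amortized_bound` is hypothetical.

```python
def pop_amortized_bound(ld):
    """floor(ld/2) + 2*ceil(log2(ld + 1)) - (ld - 1) for an integer ld >= 1."""
    # For a non-negative integer ld, ld.bit_length() equals ceil(log2(ld + 1)).
    ceil_log = ld.bit_length()
    return ld // 2 + 2 * ceil_log - (ld - 1)


# Largest value of the expression over a range of left depths.
worst = max(pop_amortized_bound(ld) for ld in range(1, 100001))
```

Over this range the maximum is 5, attained at left depths 4 and 8; for larger left depths the linear term dominates and the expression decreases.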

We conclude that at most $7n$ double rotations, hence at most $15n$ single rotations,
are performed by the sequence of operations on the kernel tree. This proves the Scanning
Theorem.

3.4.2      An  Inductive  Proof

In this section, we describe an inductive proof of a generalization of the Scanning
Theorem. The proof technique is similar to the method used to derive the upper bounds in
Section 3.2.1. The binary tree is partitioned into blocks of constant size so that the total
number of single rotations within blocks is linear. The induction is applied to a global
tree consisting of a constant fraction of the tree nodes. Since a splay on the original tree
translates into a much weaker rotation on the global tree, we have to incorporate the
strength of the rotations into the inductive hypothesis.

We state the result. For any positive integer $k$ and real number $d$, such that
$1 \le d \le n$, a right $k$-twist is called $d$-shallow if the lowest node involved in the twist has a
left depth of at most $dk$. Let $S^{(d)}(n)$ denote the maximum number of single rotations
performed by $d$-shallow right twists in any right twist sequence executed on an $n$-node
binary tree. We prove that $S^{(d)}(n) = O(dn)$. The Scanning Theorem follows from
$S^{(2)}(n) = O(n)$.

We estimate the number of $d$-shallow right twists in a right twist sequence:

Lemma 3.11 For any $d \ge 1$, the total number of $d$-shallow right twists in any right
twist sequence is at most $4dn$.

Proof. Consider any $d$-shallow right twist that rotates a sequence of edges, say
$[x_1,y_1], [x_2,y_2], \ldots, [x_k,y_k]$, such that the left depths of the sequence of nodes $x_k, y_k,
x_{k-1}, y_{k-1}, \ldots, x_1, y_1$ are nonincreasing. Let $ld(z)$ and $ld'(z)$ denote, respectively, the left
depths of any node $z$ before and after the twist. For all $i \in [\lceil k/2 \rceil, k]$, we have

$$\frac{ld'(x_i)}{ld(x_i)} = 1 - \frac{i}{ld(x_i)} \le 1 - \frac{i}{kd} \le 1 - \frac{1}{2d}.$$

In order to pay unit cost for a twist, we charge each node $x_i$, such that $i \in [\lceil k/2 \rceil, k]$,
$\min\{2d/ld(x_i), 1\}$ debits. Let us prove that the total charge is at least 1. If
$ld(x_{\lceil k/2 \rceil}) \le 2d$, then $x_{\lceil k/2 \rceil}$ is charged 1 debit. Otherwise, we have
$1 > 2d/ld(x_i) \ge 2/k$ for all $i \ge \lceil k/2 \rceil$. Since $\lfloor k/2 \rfloor + 1$ nodes are each charged at
least $2/k$ debits, the net charge to all the nodes is at least 1.

Now, we bound the total charge to a node over the entire sequence. Call a node deep
if its left depth is greater than $2d$ and shallow otherwise. Suppose that a node receives
a sequence of charges $2d/L_k, 2d/L_{k-1}, \ldots, 2d/L_0$ while it is deep. Then

$$L_i \ge \frac{2d}{(1 - 1/2d)^i} \quad \text{for all } i \ge 0.$$

Therefore the total charge to a node while it remains deep is at most

$$(1 - 1/2d)^k + (1 - 1/2d)^{k-1} + \cdots + 1 < 2d.$$

A node receives at most $2d$ debits while it is shallow. This implies that any node is
charged at most $4d$ debits, giving a bound of $4dn$ for the total number of $d$-shallow right
twists. □
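The two displayed bounds in this proof fit together as a geometric series. The derivation below is added for convenience and is not from the original text; it uses the contraction factor $1 - 1/2d$ established at the start of the proof, the fact that left depths never increase under right rotations, and the indexing in which $L_0 > 2d$ is the left depth at the last charge the node receives while deep.

```latex
\begin{align*}
L_i \ge \frac{L_{i-1}}{1 - 1/2d} \ge \cdots \ge \frac{L_0}{(1 - 1/2d)^i}
  \ge \frac{2d}{(1 - 1/2d)^i}
  &\implies \frac{2d}{L_i} \le (1 - 1/2d)^i, \\
\sum_{i=0}^{k} \frac{2d}{L_i} \le \sum_{i=0}^{k} (1 - 1/2d)^i
  < \sum_{i=0}^{\infty} (1 - 1/2d)^i
  &= \frac{1}{1 - (1 - 1/2d)} = 2d.
\end{align*}
```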

The upper bound for $S^{(d)}(n)$ is given by:

Theorem 3.8 $S^{(d)}(n) \le 87dn$ for all $d \ge 1$ and $n \ge 1$.

Proof. The proof uses induction on $n$.

Case 1. $n \le 174d$: $S^{(d)}(n) \le \binom{n}{2} \le 87dn$.

Case 2. $n > 174d$: Divide the tree into a sequence of $\lceil n/K \rceil$ blocks such that each
block except the first contains exactly $K = 29d$ nodes. The first block may contain fewer
nodes. In each block except the first, the nodes with preorder numbers 1 to $4d$ within
the block are global. The first block does not contain any global nodes. Notice that the
global nodes of a block form a connected subtree within the block whose root coincides
with the root of the block. Further, if the left path of the block contains more than $4d$
nodes then all global nodes lie on the left path of the block. Otherwise all nodes on the
left path of the block are global. The global nodes in the tree form a global tree as in
the previous upper bound constructions. The size of the global tree is at most $n/7.25$.
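The size bound on the global tree is simple arithmetic on the block partition, which can be sketched as follows. This is an illustration added here: the function name `global_tree_size` is hypothetical, and $d$ is assumed to be a positive integer so that the block size $29d$ and the count $4d$ are whole numbers.

```python
def global_tree_size(n, d):
    """Number of global nodes when n nodes are split into blocks of K = 29d nodes."""
    K = 29 * d
    num_blocks = -(-n // K)  # ceil(n / K) using integer arithmetic
    # Every block except the first contributes exactly 4d global nodes.
    return 4 * d * (num_blocks - 1)
```

Since the number of blocks other than the first is less than $n/K$, the result is below $4dn/(29d) = n/7.25$; for example, $n = 175$ and $d = 1$ give 7 blocks and 24 global nodes.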

We analyze the effect of an original tree rotation on the global tree. An interblock
right  single  rotation  translates  into  a  corresponding  rotation  on  the  global  tree  if  both 
nodes  of  the  rotation  are  global.  Otherwise  the  global  tree  is  not  affected.  The  analysis 
of  an  intrablock  right  single  rotation  involves  the  following  cases: 

Local-local:  The  global  tree  is  unaffected. 

Local-global: Again, the global tree is not affected, but the global role is transferred
from the global node to the local node.

Global-global: Let [x,p] denote the rotated edge such that p is the parent of x. If
the left subtree of x within the block contains only global nodes, then the rotation
simply propagates to the global tree. Otherwise, p's global role is transferred to the
node, say x', in p's block that had preorder number 4d + 1 initially. Let p' denote
the lowest ancestor of x' in the original tree which is global. The effect of the
transfer of global role on the structure of the global tree is to contract edge [x,p]
and add a new edge [x',p']. We show that the same transformation is realizable
through a series of right single rotations in the global tree. These rotations are
performed by traversing the path from p to p' in the global tree as follows (see
Figure 3.8):

Start  at  edge  [p,x]  and  repeat  the  following  operation  until  the  last  edge 
on  the  path  is  reached:  If  the  next  edge  on  the  path  is  a  left  edge,  move 
to  the  next  edge;  otherwise,  rotate  the  current  edge  and  move  to  the 
next  edge  after  the  rotation.  Finally,  if  x'  belongs  to  the  right  subtree  of 
p'  in  the  original  tree,  rotate  the  last  global  tree  edge  traversed. 

Remark. The operation performs all the rotations within the subtree
of the global tree rooted at x. This is seen as follows. If x = p',
then  no  rotations  are  performed  on  the  global  tree.  Otherwise,  p'  lies  in 
the  left  subtree  of  x.  Hence  the  successor  of  edge  [p,  x]  on  the  global  tree 
path  from  p  to  p'  is  a  left  edge.  This  implies  that  the  operation  does  not 
rotate  edge  [p,  x].  Therefore  all  the  rotations  performed  by  the  operation 
occur  in  the  subtree  of  the  global  tree  rooted  at  x. 

At any time during this traversal, contracting the current edge results in a tree that is
identical to the tree obtained by contracting the edge [x,p] in the initial tree, so it
follows that the above series of rotations on the global tree correctly simulates the
rotation of edge [x,p].

In summary, a right single rotation of an edge [x,p], such that p is the parent of x,
either does not affect the global tree, or translates into a rotation of the edge [x,p] in
the global tree, or translates into a sequence of right single rotations on the subtree of
the global tree rooted at x. The rotation is called global if it results in a rotation of the
corresponding edge in the global tree and local otherwise.
[Figure 3.8. Panels: the original tree; the global tree.]

Figure 3.8: The transfer of global role in an intrablock global-global rotation. Circles
denote the nodes of the block. The starred nodes of the original tree are the global
nodes.
Consider the effect of a right twist in the original tree on the global tree. The sequence
of single rotations on the global tree caused by the right twist comprises l-rotations,
caused by local rotations in the twist, and g-rotations, caused by global rotations in the
twist. The nodes involved in any l-rotation are distinct from the nodes involved in any
previous g-rotation, hence we may transform the sequence of global tree rotations by
moving each l-rotation before all the g-rotations without altering the net effect of the
sequence on the global tree. The suffix of the sequence consisting of all the g-rotations
defines a global twist on the global tree. In summary, the effect of a right twist in the
original tree on the global tree is a right rotation followed by a global twist corresponding
to the subsequence of global rotations in the twist.

We  estimate  the  number  of  single  rotations  performed  by  d-shallow  twists  in  a  right 
twist sequence executed on the tree. Consider any d-shallow twist in the sequence. Define
the  left  path  of  the  twist  to  be  the  left  path  resulting  from  the  contraction  of  the  right 
edges  on  the  access  path  of  the  lowest  node  involved  in  the  twist.  We  classify  the  right 
single  rotations  performed  by  the  twist  as  follows: 

Type  1.  Local,  interblock  rotation  in  which  the  top  node  is  global:  There  is  at  most 
one  Type-1  rotation  per  twist  because  the  left  subtree  of  the  bottom  node  of  the  rotation 
consists  of  only  local  nodes. 

Type 2. Local, interblock rotation in which the top node is local: The top node
lies on the left path of its block and, since the node is local, it has 4d global ancestors
within the block that lie on the left path of the twist. Notice that the top nodes of
different Type-2 rotations belong to different blocks. Thus, if $k_2$ denotes the number of
Type-2 rotations performed by the twist, then the left path of the twist contains at least
$(4d+1)k_2$ edges. Since the number of edges on the left path of the twist is bounded by
$dk$, we obtain that $k_2 \le \lfloor k/4 \rfloor$.

Type 3. Local, intrablock rotation: For each Type-3 rotation, charge (8/3) debits
to the block in which the rotation is performed. If the number of Type-3 rotations is at
least $(3k-4)/8$, the total charges to the blocks plus a charge of (4/3) debits to the twist
itself pays for all the rotations performed by the twist.

Type 4. Global rotation: Only the situation where the number of Type-3 rotations
is less than $(3k-4)/8$ needs to be considered. In this case at least

$$k - 1 - \lfloor k/4 \rfloor - (\lceil (3k-4)/8 \rceil - 1) = k - \lfloor k/4 \rfloor - \lfloor (3k+3)/8 \rfloor \ge 3k/8$$

global rotations are performed. Therefore the global twist performs at least $3k/8$ rotations
on the global tree, and it is $(8d/3)$-shallow. If we charge each such global twist
(4/3) times the actual cost, then all the rotations can be paid for. This is seen as follows.
Let $k_3$ and $k_4$ denote, respectively, the number of Type-3 and Type-4 rotations. Then
$k_3 + k_4 \ge 3k/4 - 1$. The total charge is $8k_3/3 + 4/3 + 4k_4/3$, which is minimized when
$k_3 = 0$. When $k_3 = 0$, the total charge is at least $4/3 + (4/3)(3k/4 - 1) \ge k$.
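The counting in the Type 3 and Type 4 cases can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it assumes (as the text does) at most one Type-1 rotation and at most ⌊k/4⌋ Type-2 rotations in a twist of k rotations.

```python
from fractions import Fraction

def min_global_rotations(k):
    """Lower bound on Type-4 rotations in a twist of k rotations, assuming
    fewer than (3k - 4)/8 of the rotations are Type 3."""
    max_type3 = -((-(3 * k - 4)) // 8) - 1   # ceil((3k - 4)/8) - 1
    return k - 1 - k // 4 - max_type3        # remove Type 1, Type 2, Type 3

def charge(k3, k4):
    """8/3 per Type-3 rotation, 4/3 to the twist, 4/3 per global rotation."""
    return Fraction(8, 3) * k3 + Fraction(4, 3) + Fraction(4, 3) * k4

def twist_is_paid(k):
    """Check that the charge covers all k rotations for every Type-3 count."""
    for k3 in range(k):
        if 8 * k3 >= 3 * k - 4:
            k4 = 0                           # Type 3 case: k4 is not needed
        else:
            k4 = k - 1 - k // 4 - k3         # fewest possible global rotations
        if charge(k3, k4) < k:
            return False
    return True
```

Exact rational arithmetic is used so that the boundary cases, where the charge equals k exactly, are not blurred by rounding.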

Since each d-shallow twist is charged at most 4/3 debits, the total charge to all the
d-shallow twists is at most $16dn/3$ by Lemma 3.11. The total charge to a block of
size $s$ is at most $8\binom{s}{2}/3$. It follows that the total charge to all the blocks is at most
$4nK/3 \le 116dn/3$. By the inductive hypothesis, the total charge to all the $(8d/3)$-shallow
global twists is at most $(4/3)(87)(8d/3)(n/7.25) = 128dn/3$. Therefore the sum
total of all the charges is bounded by $16dn/3 + 116dn/3 + 128dn/3 \le 87dn$, completing
the induction step. □
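As a check on the arithmetic in the induction step, the following worked computation takes the three bounds to be $16dn/3$, $116dn/3$ and the inductive bound with constant 87 applied to a global tree of $n/7.25$ nodes; these values are read off the text above and are the only assumptions.

```latex
\begin{align*}
\frac{4}{3}\cdot 87\cdot\frac{8d}{3}\cdot\frac{n}{7.25}
  &= \frac{32}{9}\cdot\frac{87}{7.25}\,dn
   = \frac{32}{9}\cdot 12\,dn
   = \frac{128\,dn}{3},\\[4pt]
\frac{16\,dn}{3}+\frac{116\,dn}{3}+\frac{128\,dn}{3}
  &= \frac{260\,dn}{3}
   \approx 86.67\,dn \;\le\; 87\,dn .
\end{align*}
```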
Chapter 4

Testing  Set  Equality 


The problem of maintaining a dynamic collection of sets under various operations arises
in numerous applications. A natural application is the implementation of high-level
programming languages like SETL that support sets and permit operations such as equality,
membership, union, intersection, etc. on them. The general problem of efficiently
maintaining sets under all of these operations appears quite difficult. This chapter describes a
fast data structure for maintaining sets under equality-tests and under creations of new
sets through insertions and deletions of elements.¹

4.1      Introduction 

The  Set  Equality-testing  Problem  is  to  maintain  a  collection  of  sets  over  a  finite,  ordered 
universe  under  the  following  operations: 

• Equal(S,T): Test if S = T.

• Insert(S,x,T): Create a new set T = S ∪ {x}.

• Delete(S,x,T): Create a new set T = S \ {x}.

The  collection  initially  contains  just  the  empty  set.  We  would  like  to  devise  a  data 
structure  for  this  problem  that  tests  equality  of  sets  in  constant  time  and  executes  the 
remaining  operations  as  fast  as  possible,  under  this  constraint. 


¹ The work of this chapter was reported in a joint paper with Robert E. Tarjan [31].
If sets are represented by unique storage structures, then equality-testing of a pair of
sets can be implemented in constant time by just checking whether they are represented
by a single storage structure; uniqueness simply means that all the instances of a set
are represented by a single storage structure. Following this natural approach, several
people have devised unique storage representations for sets that allow constant time
equality-tests and can be updated efficiently. Wegman and Carter [35] gave a randomized
signature representation for sets that can be updated in constant time and constant
space but errs with a small probability during equality-tests. Pugh [26] and Pugh and
Teitelbaum [27] gave an error-free randomized binary trie representation for sets that
can be updated in $O(\log n)$ expected time and $O(\log n)$ expected space, where $n$ denotes
the size of the updated set. Their data structures also support union and intersection of
sets, although less efficiently. Yellin [40] gave a deterministic binary trie representation
of sets that can be updated in $O(\log^2 m)$ time and $O(\log m)$ space, where $m$ denotes the
total number of updates.

We devise a deterministic data structure for the Set Equality-testing Problem
requiring $O(\log m)$ amortized time and $O(\log m)$ space per update operation. The data
structure is based on a solution to a more fundamental problem involving S-expressions.
S-expressions [5] constitute the staple data type of the programming language LISP. An
S-expression is either an atom (signifying a number or a character string) or a pair
of S-expressions. An atom S-expression is represented in storage by a node; a pair
S-expression is represented by a node with left and right pointers that point to nodes
representing the component S-expressions. We store S-expressions uniquely, i.e. all
instances of an S-expression are represented by a single node. Cons($s_1$, $s_2$) returns the
S-expression $(s_1 . s_2)$. A cascade of Cons operations is a sequence of Cons operations in
which the result of each Cons operation is an input to the next Cons operation. For
instance,

    s_1 := Cons(s_0, t_0)
    s_2 := Cons(s_1, t_1)
    ...
    s_f := Cons(s_{f-1}, t_{f-1})

is a cascade of $f$ Cons operations. The S-expression problem in question is to devise
a data structure for efficiently implementing cascades of Cons operations on uniquely
stored S-expressions.
Unique storage of S-expressions makes Cons operations expensive. Given a pair
of S-expressions, a Cons operation has to check whether there is a third S-expression
in the collection with these S-expressions as its component S-expressions. Viewing the
collection of S-expressions as a dictionary, this is equivalent to performing a search
operation, possibly followed by an insertion, on the dictionary. Single Cons operations
can be implemented in $O(\sqrt{\log F})$ time and $O(1)$ amortized space or, alternately, in
$O(1)$ time and $O(F^{\epsilon})$ space, where $F$ denotes the total number of Cons operations
performed and $\epsilon$ is any positive constant. This implementation is based on Willard's
data structure [37] for maintaining a dictionary in a small universe. Universal hashing
[8] and dynamic perfect hashing [13] offer alternate implementations that require $O(1)$
randomized amortized time and $O(1)$ amortized space per Cons operation.

We develop a data structure that performs a cascade of $f$ Cons operations in
$O(f + \log m_c)$ amortized time, where $m_c$ denotes the total number of cascades
performed. The total space used is proportional to the number of distinct S-expressions
present. This means that Cons operations can be implemented in constant amortized
time and constant space in situations where these operations occur in long cascades.
Our set-equality-testing data structure is an immediate corollary of this result. When
sets are represented by binary tries, an update operation translates into a cascade of
at most $\log m$ Cons operations and requires $O(\log m)$ amortized time using this data
structure. Many list-oriented functions in functional languages (LISP, for instance)
involve cascades of Cons operations and can be implemented efficiently using this method;
function Append is a typical example:


    Append([v_1, v_2, ..., v_k], [w_1, w_2, ..., w_l]) =
        s_1    := Cons(v_k, [w_1, w_2, ..., w_l])
        s_2    := Cons(v_{k-1}, s_1)
        ...
        Result := Cons(v_1, s_{k-1})


The  chapter  is  organized  as  follows.  In  Sections  4.2  and  4.3,  we  describe  the  data 
structure  for  equality-testing  of  sets  and  analyze  its  performance.  In  Section  4.4,  we 
discuss  directions  for  further  work. 
4.2 The Data Structure

We reduce the Set Equality-testing Problem to the problem of implementing cascades
of Cons operations on uniquely stored S-expressions. The elements seen so far are
numbered in serial order and define the current universe $U = [1, |U|]$. Each set is
represented by a binary trie [21] in this universe. The binary trie representing a set
$S$ is an S-expression that stores the elements of $S$ as atoms and is defined recursively.
Let $2^p < |U| \le 2^{p+1}$. A singleton set is represented by an atom and the empty set,
by the atom NIL. If $|S| \ge 2$, then $S$ is represented by a pair $(s_1 . s_2)$, where $s_1$ and $s_2$
are, respectively, the S-expressions representing subsets $S \cap [1, 2^p]$ and $S \cap [2^p + 1, |U|]$
in their respective subuniverses. We store S-expressions uniquely so that two sets are
equal if and only if their S-expressions are represented by a single node. A set update
operation translates into a cascade of at most $\log |U| \le \log m$ Cons operations, which
can be implemented in $O(\log m)$ amortized time and $O(\log m)$ space using a method
described below; $m$ denotes the total number of update operations.
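The reduction can be illustrated with a short sketch of the binary trie representation. Several simplifications are assumptions of this sketch and not part of the text: the universe size is fixed in advance, atoms are Python integers compared by value, and unique storage of pairs is obtained with a dict rather than with the parent-set method of this section. The class and method names are hypothetical.

```python
NIL = None  # the atom NIL, representing the empty set

class Pair:
    """A uniquely stored pair node (s1 . s2)."""
    __slots__ = ("left", "right")

    def __init__(self, left, right):
        self.left, self.right = left, right

class SetStore:
    def __init__(self, universe_size):
        self.n = universe_size   # universe [1, n], fixed here for simplicity
        self._table = {}         # (left, right) -> unique Pair

    def _cons(self, left, right):
        key = (left, right)
        if key not in self._table:
            self._table[key] = Pair(left, right)
        return self._table[key]

    def _join(self, left, right):
        """Canonical trie for the union of two half-universe tries."""
        if right is NIL and not isinstance(left, Pair):
            return left          # empty set or singleton
        if left is NIL and not isinstance(right, Pair):
            return right         # singleton
        return self._cons(left, right)

    def _update(self, node, x, lo, hi, present):
        if lo == hi:
            return x if present else NIL
        p = (hi - lo).bit_length() - 1   # 2**p < hi - lo + 1 <= 2**(p + 1)
        mid = lo + 2 ** p - 1
        if isinstance(node, Pair):
            left, right = node.left, node.right
        elif node is NIL:
            left = right = NIL
        else:                            # a singleton stored as an atom
            left, right = (node, NIL) if node <= mid else (NIL, node)
        if x <= mid:
            left = self._update(left, x, lo, mid, present)
        else:
            right = self._update(right, x, mid + 1, hi, present)
        return self._join(left, right)   # the Cons calls form a cascade

    def insert(self, s, x):
        return self._update(s, x, 1, self.n, True)

    def delete(self, s, x):
        return self._update(s, x, 1, self.n, False)

    @staticmethod
    def equal(s, t):
        """Constant-time test: identity for pairs, value comparison for atoms."""
        return s is t if isinstance(s, Pair) else s == t
```

Each update descends one root-to-leaf path of the trie and rebuilds it bottom-up, so it issues at most about log n Cons operations, each taking the previous result as an input.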

We describe an efficient data structure for performing cascades of Cons operations
on uniquely stored S-expressions. The data structure requires $O(f + \log m_c)$ amortized
time to perform a cascade of $f$ Cons operations, where $m_c$ denotes the total number of
cascades performed. Consider the collection of nodes representing S-expressions.
Number these nodes serially in their order of creation. A parent of a node $v$ is defined to
be a node that points to $v$. Each node $v$ maintains a set parents($v$) of all its parents.
Each parent $p \in$ parents($v$) is assigned a key equal to (serial#($w$), $b$), where $w$ is the
other node (besides $v$) pointed to by $p$, and $b$ equals 0 or 1 depending on whether the
left pointer of $p$ points to $v$ or not. To perform a Cons operation on two nodes, $v$ and
$w$, we search the set parents($v$) using the key (serial#($w$), 0) and return the matching
parent. If there is no matching parent, we create a new node $p$ with pointers to $v$ and
$w$, set parents($p$) to empty, insert $p$ into parents($v$) and parents($w$), and return $p$. In a
cascade of Cons operations, we implement each Cons operation by searching in the set
of parents of the node returned by the previous Cons operation.
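The bookkeeping with serial numbers and parent sets can be sketched as follows. For brevity each parent set is a Python dict keyed by (serial#, b); this is a stand-in chosen for the sketch, whereas the text goes on to represent parent sets by splay trees. The names `Node`, `SExprStore`, `atom`, `cascade` and the `search_right` flag are hypothetical.

```python
class Node:
    def __init__(self, serial, left=None, right=None, value=None):
        self.serial = serial       # position in order of creation
        self.left, self.right = left, right
        self.value = value         # set only for atoms
        self.parents = {}          # (serial#(other child), b) -> parent node

class SExprStore:
    def __init__(self):
        self._count = 0
        self._atoms = {}           # atoms are interned by value

    def _new(self, **fields):
        node = Node(self._count, **fields)
        self._count += 1
        return node

    def atom(self, value):
        if value not in self._atoms:
            self._atoms[value] = self._new(value=value)
        return self._atoms[value]

    def cons(self, v, w, search_right=False):
        """Return the unique node (v . w).

        By default parents(v) is searched with key (serial#(w), 0); with
        search_right=True, parents(w) is searched with key (serial#(v), 1).
        """
        if search_right:
            p = w.parents.get((v.serial, 1))
        else:
            p = v.parents.get((w.serial, 0))
        if p is None:              # no matching parent: create and register
            p = self._new(left=v, right=w)
            v.parents[(w.serial, 0)] = p
            w.parents[(v.serial, 1)] = p
        return p

    def cascade(self, s0, ts):
        """s_i := Cons(s_{i-1}, t_{i-1}), searching parents of the last result."""
        s = s0
        for t in ts:
            s = self.cons(s, t)
        return s
```

Registering each new node in the parent sets of both children is what lets a cascade always search from the node returned by the previous Cons, whichever argument position that node occupies.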

We represent each set parents($v$) by a binary search tree and perform searches and
insertions on the tree using the Splay Algorithm.² A search operation is followed by a
splay on the last-visited node during the search. A new element is inserted into the tree
as follows. If the inserted element is larger than the current maximum element, insert
it as the right child of the maximum element; this requires maintaining a pointer to the
rightmost node in the tree. Otherwise insert the element into the tree in the standard
top-down manner and then splay at the element. These two types of insertions are called
passive and active, respectively. We implement passive insertions more efficiently since
they are more numerous than active insertions.

² The Splay Algorithm is described in Chapter 3, Section 3.1.1.
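The two kinds of insertion can be sketched with a small bottom-up splay tree. This is an illustrative sketch under stated assumptions: keys are distinct, the first insertion into an empty tree is reported as passive (a choice made here, not stated in the text), and the class and method names are hypothetical.

```python
class TreeNode:
    __slots__ = ("key", "left", "right", "parent")

    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

class SplayTree:
    def __init__(self):
        self.root = None
        self.max_node = None       # pointer to the rightmost node

    def _rotate(self, x):
        """Rotate the edge joining x to its parent, moving x up one level."""
        p, g = x.parent, x.parent.parent
        if p.left is x:
            p.left, x.right = x.right, p
            if p.left is not None:
                p.left.parent = p
        else:
            p.right, x.left = x.left, p
            if p.right is not None:
                p.right.parent = p
        p.parent, x.parent = x, g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x

    def _splay(self, x):
        while x.parent is not None:
            p, g = x.parent, x.parent.parent
            if g is not None:
                # zig-zig rotates the upper edge first, zig-zag the lower one
                zigzig = (g.left is p) == (p.left is x)
                self._rotate(p if zigzig else x)
            self._rotate(x)

    def search(self, key):
        node, last = self.root, None
        while node is not None:
            last = node
            if key == node.key:
                break
            node = node.left if key < node.key else node.right
        if last is not None:
            self._splay(last)      # splay at the last-visited node
        return last is not None and last.key == key

    def insert(self, key):
        """Insert a new key; return 'passive' or 'active'."""
        new = TreeNode(key)
        if self.root is None:
            self.root = self.max_node = new
            return "passive"
        if key > self.max_node.key:            # passive: no splay
            self.max_node.right, new.parent = new, self.max_node
            self.max_node = new
            return "passive"
        node = self.root                       # active: top-down insert
        while True:
            child = node.left if key < node.key else node.right
            if child is None:
                break
            node = child
        if key < node.key:
            node.left = new
        else:
            node.right = new
        new.parent = node
        self._splay(new)
        return "active"

    def keys(self):
        """Keys in symmetric order, computed without recursion."""
        out, stack, node = [], [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            out.append(node.key)
            node = node.right
        return out
```

Since nodes are numbered in order of creation, a freshly created parent has the largest serial number so far, which is why insertions of increasing keys are the common, passive case.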

4.3     The  Analysis 

The  following  theorem  summarizes  the  performance  of  the  data  structure  for  CONS 
operations. 

Theorem 4.1 The amortized cost of a cascade of $f$ Cons operations equals $O(f +
\log m_c)$, where $m_c$ denotes the total number of cascades performed on S-expressions.

The  key  idea  behind  the  proof  of  this  theorem  is  to  bound  the  cost  of  operations  on 
a parent set using a strong form of Sleator and Tarjan's Static Optimality Theorem [28].
We  focus  on  the  graph  induced  by  the  S-expression  nodes,  write  the  static  optimality 
expressions  for  all  these  nodes,  and  bound  the  sum  of  the  static  optimality  expressions 
over  all  the  nodes,  using  the  fact  that  S-expression  nodes  have  at  most  two  children  (even 
though  they  might  have  unboundedly  many  parents). 

We state the lemmas used in proving the theorem. The following lemma uses the
notion of blocks in a binary tree introduced in Chapter 3, Section 3.2.1, and occurs
implicitly in the work of Cole et al. [11].

Lemma 4.1 Consider a binary search tree whose elements have been assigned arbitrary
nonnegative weights. Suppose that the tree is partitioned into blocks so that each block
has a positive weight (the weight of a block equals the total weight of all the elements in
it). Let $n$ denote the number of elements in the tree and let $n_b$ denote the number of
blocks. The cost of a sequence of $m$ splays performed on the roots of the blocks equals
$O(m + n + \sum_{j=1}^{m} \log(W/w_j) + \sum_{i=1}^{n_b} \log(W/w_i))$, where $W$ = the total weight of all the
elements, $w_j$ = the weight of the block of the $j$th accessed element, and $w_i$ = the weight
of the $i$th block of the tree.
Proof. Assign potentials to the nodes of the tree as described by Cole et al. [11]
in Section 2, "Global insertions", and analyze the splays using their analysis of global
insertions.³ Their analysis yields the following conclusions: the amortized cost of a splay
on the root of a block with weight $w$ equals $O(1 + \log(W/w))$; the drop in potential over
the entire sequence equals $O(\sum_{i=1}^{n_b} \log(W/w_i) + n)$. The result follows. □

The  following  lemma  bounds  the  cost  of  the  sequence  of  operations  performed  on  a 
single  parent  set  and  it  is  the  key  idea  underlying  the  analysis. 

Lemma 4.2 Consider a sequence of insertions and searches performed on an (initially
empty) binary search tree using splays. Let

    f_i = the number of searches of element i,
    F   = the total number of searches,
    n_a = the number of active insertions, and
    n   = the total number of insertions.

The cost of this sequence equals $O(n + n_a \log n_a + F + \sum_{f_i \ge 1} f_i \log(F/f_i))$.

Proof.  We  modify  the  sequence  by  preinserting  all  the  elements  into  the  initial  tree 
according  to  their  order  of  arrival  (without  splaying).  On  this  tree,  we  perform  the 
searches  and  simulate  the  insertions.  Active  insertions  are  simulated  by  splaying  at  the 
corresponding  elements  and  passive  insertions  are  simply  ignored.  We  obtain  a  sequence 
of  splays  corresponding  to  active  insertions  and  searches  {active  splays  and  hot  splays, 
respectively).  It  suffices  to  bound  the  cost  of  this  sequence. 

We  bound  the  cost  of  this  sequence  by  partitioning  the  tree  into  blocks  and  applying 
Lemma  4.1.  Partition  the  tree  into  blocks  as  follows.  The  elements  accessed  by  active 
and  hot  splays  are,  respectively,  called  active  and  hot.  Every  active  or  hot  element 
forms  a  singleton  block.  Each  nonempty  interval  of  nodes  between  consecutive  singleton 
blocks  forms  a  passive  block.  Choose  an  element  from  each  passive  block  and  call  it 
the block representative. Note that $n_a$ = the number of active elements. The weight of
element $i$ is defined by:

$$w_i = \begin{cases} f_i & \text{if the element is hot} \\ F/(n_a + 1) & \text{if the element is active but not hot} \\ 0 & \text{if the element is in a passive block but not the representative} \end{cases}$$

³ An account of this analysis can also be found in Cole [9], Section 4.
The representatives of $n_a + 1$ of the passive blocks are each assigned a weight of $F/(n_a + 1)$;
the representatives of the remaining passive blocks are placed in one-to-one
correspondence with the set of hot elements and assigned the weights of their mates. The total
weight of the tree is at most $4F$. Applying Lemma 4.1, the cost of the sequence of splays
equals

$$O\Bigl(n_a + F + n + \sum_{f_i \ge 1} f_i \log(4F/f_i) + n_a \log(4(n_a + 1)) + 2\Bigl(\sum_{f_i \ge 1} \log(4F/f_i)\Bigr) + (2n_a + 1)\log(4(n_a + 1))\Bigr)$$
$$= O\Bigl(n + n_a \log n_a + F + \sum_{f_i \ge 1} f_i \log(F/f_i)\Bigr).$$

□
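The bound of $4F$ on the total weight used in the proof of Lemma 4.2 can be checked group by group. The following worked computation is a sketch that assumes every search is counted in exactly one $f_i$, so that the hot weights sum to $F$; the grouping labels are introduced here for illustration.

```latex
\begin{align*}
\text{hot elements:} \quad & \sum_{i \text{ hot}} f_i = F,\\
\text{active, not hot:} \quad & \text{at most } n_a \cdot \frac{F}{n_a + 1} < F,\\
\text{first } n_a + 1 \text{ representatives:} \quad & (n_a + 1)\cdot\frac{F}{n_a + 1} = F,\\
\text{remaining representatives:} \quad & \text{at most } \sum_{i \text{ hot}} f_i = F
   \quad \text{(each takes the weight of a distinct hot mate)},\\[4pt]
\text{hence} \quad & W \le F + F + F + F = 4F .
\end{align*}
```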

Remark. The lemma is a strong form of Sleator and Tarjan's Static Optimality
Theorem [28]. The term static optimality comes from the expression $\sum_i f_i \log(F/f_i)$
which gives the weighted path length of the optimal static binary tree whose leaves have
weights $f_1, f_2, \ldots, f_n$. Their theorem applies only to sequences of searches in which all
the elements of the tree are accessed at least once. The use of Cole et al.'s sharper
analysis [11] yielded our stronger lemma.
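For a feel of the static optimality expression, the following small sketch evaluates $\sum_{f_i \ge 1} f_i \log(F/f_i)$ for a given list of search frequencies. The function name is hypothetical and the logarithm is taken to base 2, which is an assumption; a different base only changes the value by a constant factor.

```python
import math

def static_optimality(freqs):
    """Sum of f * log2(F / f) over the frequencies f >= 1, where F = sum(freqs)."""
    total = sum(freqs)
    return sum(f * math.log2(total / f) for f in freqs if f >= 1)
```

For example, the frequencies 4, 2, 1, 1 give 4·1 + 2·2 + 1·3 + 1·3 = 14, while eight elements searched once each give 8·3 = 24: skewed access patterns are cheaper than uniform ones.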

The  following  graph  inequality  will  help  us  to  bound  the  sum  of  the  static  optimality 
expressions  over  the  nodes  of  the  S-expression  graph,  using  the  fact  that  the  nodes  of 
this  graph  have  constant-bounded  indegrees. 

Lemma 4.3 Consider a digraph $G = (V, E)$ and consider a collection of walks in $G$.
Let

    F_e  = the number of traversals of edge e in the walks,
    F_v  = \sum_{(v,w) \in E} F_{(v,w)}, for any vertex v,
    W_v  = the number of walks originating at vertex v, and
    id_v = indegree(v) + 1.

Then,

$$\sum_{(v,w) \in E} F_{(v,w)} \log(F_v / F_{(v,w)}) \le \sum_{v \in V} F_v \log id_v + \sum_{W_v \ge 1} W_v \log W_v.$$

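Proof. Let $F'_{(v,w)} = F_{(v,w)} - \#$walks with $(v,w)$ as the last edge.

$$\begin{aligned}
\sum_{(v,w) \in E} F_{(v,w)} \log(F_v / F_{(v,w)})
 &= \sum_{F_v \ge 1} F_v \log F_v + \sum_{(v,w) \in E} F_{(v,w)} \log(1/F_{(v,w)}) \\
 &\le \sum_{F_v \ge 1} W_v \log F_v + \sum_{(x,v) \in E} F'_{(x,v)} \log F_v + \sum_{(v,w) \in E} F_{(v,w)} \log(1/F_{(v,w)}) \\
 &\le \sum_{F_v \ge 1} W_v \log F_v + \sum_{F_v \ge 1} \sum_{(x,v) \in E} F'_{(x,v)} \log(F_v / F'_{(x,v)})
   \qquad (x \log(1/x) \text{ is decreasing in } [1/e, \infty)) \\
 &\le \sum_{W_v \ge 1} W_v \log W_v + \sum_{v \in V} F_v \log id_v
   \qquad \text{(entropy inequality).} \quad \Box
\end{aligned}$$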
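The inequality of Lemma 4.3 can be spot-checked on small random instances. The sketch below is an illustration only: it assumes a simple digraph given as a set of edges, walks that each traverse at least one edge, and base-2 logarithms; the function names are hypothetical.

```python
import math
import random
from collections import Counter

def lemma_4_3_sides(edges, walks):
    """Return (lhs, rhs) of the inequality for the given edges and walks.

    edges is a set of pairs (v, w); each walk is a list of vertices that
    follows edges of the graph and contains at least one edge.
    """
    f_edge, w_start = Counter(), Counter()
    for walk in walks:
        w_start[walk[0]] += 1
        for e in zip(walk, walk[1:]):
            assert e in edges
            f_edge[e] += 1
    f_vertex = Counter()
    for (v, _), count in f_edge.items():
        f_vertex[v] += count
    indegree = Counter(w for _, w in edges)
    lhs = sum(c * math.log2(f_vertex[v] / c) for (v, _), c in f_edge.items())
    rhs = sum(c * math.log2(indegree[v] + 1) for v, c in f_vertex.items())
    rhs += sum(c * math.log2(c) for c in w_start.values())
    return lhs, rhs

def random_walks(edges, count, max_len, rng):
    """Generate walks that start at vertices having at least one out-edge."""
    out = {}
    for v, w in sorted(edges):
        out.setdefault(v, []).append(w)
    starts = sorted(out)
    walks = []
    for _ in range(count):
        walk = [rng.choice(starts)]
        for _ in range(rng.randint(1, max_len)):
            if walk[-1] not in out:
                break                      # dead end: the walk stops here
            walk.append(rng.choice(out[walk[-1]]))
        walks.append(walk)
    return walks
```

Two walks from a vertex with two out-edges and no in-edges, one along each edge, make both sides equal to 2, so the bound is tight in that case.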


We  are  ready  to  prove  the  theorem. 

Proof of Theorem 4.1. Consider a sequence of $m_c$ cascades of Cons operations,
comprising $F$ Cons operations in total. The cost of a cascade of $f$ Cons operations equals
$O(f)$ plus the cost of operations performed on parent sets. During any cascade of Cons
operations, there are at most two active insertions into parent sets. These insertions are
performed when the first node is created by the cascade; all subsequent insertions are
passive. Hence, out of at most $2F$ insertions into parent sets performed in total during the
sequence of cascades, at most $2m_c$ insertions are active insertions. Applying Lemma 4.2
to the sequence of insertions and searches performed on each parent set and summing
the costs over all parent sets, we see that the total cost of parent set operations equals

$$O\Bigl(F + m_c \log m_c + \sum_{\text{nodes } v} \; \sum_{i \in \mathrm{parents}(v),\ f_i \ge 1} f_i \log(F(v)/f_i)\Bigr),$$

where $F(v)$ denotes the total number of searches performed on parents($v$) and $f_i$ denotes
the number of searches of element $i$ among these. The double summation bounds the
total cost of all searches performed on the parent sets. This summation can be bounded
using Lemma 4.3. The collection of nodes at the end of the sequence of cascades induces
a directed graph whose vertices are the S-expression nodes and whose edges go from
nodes to their parents. The indegree of each vertex in this graph is at most 2. For each
edge $(v,w)$, define $F_{(v,w)}$ = the number of searches of node $w$ performed on parents($v$).
Delete all edges $e$ such that $F_e = 0$. Applying Lemma 4.3 to the resulting graph, we see
that the summation is bounded by $F \log 3 + m_c \log m_c$. It follows that the cost of the
sequence of cascades equals $O(F + m_c \log m_c)$. The theorem follows. □
4.4 Directions for Further Work

The  following  open  problems  arise  naturally  in  connection  with  this  work: 

1.  Is  there  a  data  structure  for  implementing  Cons  operations  in  constant  amortized 
time  and  constant  amortized  space,  in  general? 

2. Prove (or disprove) that the problem of maintaining sets under the complete
repertoire of set operations has no efficient solution. An efficient solution is one that
implements all set operations in time polylogarithmic in the number of update
operations.

3. The Sequence Equality-testing Problem [31] is to maintain a collection of sequences
from a finite, ordered universe under equality-tests and under creations of new
sequences through insertions and deletions of elements. There exists a data structure
that performs equality-tests of sequences in constant time and updates sequences
in about $\sqrt{n}$ time/space, where $n$ denotes the length of the updated sequence.
The problem can be solved in $O(\log m)$ time/space per update operation if either
sequences are repetition-free or randomization and a small error are permitted; $m$
denotes the number of update operations. The existence of a deterministic (or even
an error-free randomized) data structure that updates sequences in polylogarithmic
time/space, in general, remains open.